Characterizing The Gene Networks Associated With Non-Coding Elements In Pediatric Cancer Using Integrative Genomics

Apexa Modi, University of Pennsylvania

Abstract

There is a need to better understand non-coding elements that drive pediatric cancers. Long non-coding RNAs (lncRNAs) play an important role in gene regulation and contribute to tumorigenesis; however, which lncRNAs are expressed in pediatric cancer histotypes and whether any are common drivers remain unknown. Here, we curate RNA sequencing data for 1,044 pediatric leukemia and solid tumors and integrate paired tumor whole genome sequencing and epigenetic data in relevant cell line models to explore lncRNA expression, regulation, and association with cancer. We report a total of 2,657 robustly expressed lncRNAs across six pediatric cancers, including 1,142 exhibiting histotype-specific expression. Next, we applied a multi-dimensional framework to identify and prioritize lncRNAs impacting gene networks, which revealed that lncRNAs dysregulated in pediatric cancer are associated with proliferation, metabolism, and DNA damage hallmarks. Altogether, these analyses were integrated to prioritize lncRNAs for experimental validation, and we showed that silencing of TBX2-AS1, our top-prioritized neuroblastoma-specific lncRNA, resulted in significant growth inhibition of neuroblastoma cells, confirming our computational predictions. Taken together, these data provide a comprehensive characterization of lncRNA regulation and function in pediatric cancers and pave the way for future mechanistic studies. In addition to non-coding RNA, non-coding genetic variation can also play a role in driving pediatric cancer. In a second study, we focus on understanding the impact of non-coding genetic variation on neuroblastoma susceptibility and progression. Neuroblastoma is the most common extra-cranial solid pediatric cancer. Given its low somatic mutation burden, the disease is thought to be driven, in part, by germline variation.
Our neuroblastoma genome-wide association study (GWAS) has identified nineteen loci associated with disease and confirmed that the majority of associated variants are non-coding. We analyzed histone ChIP-seq, ATAC-seq, and high-resolution promoter Capture C data to prioritize variants that are likely to impact regulatory regions, including promoters and enhancers. This integrative approach not only revealed new genes associated with neuroblastoma susceptibility, but also nominated several causal variants based on predicted impact on transcription factor binding and gene expression. Altogether, our studies provide actionable hypotheses about how non-coding elements such as lncRNAs and common variation can drive pediatric cancer.