Date of Award
Doctor of Philosophy (PhD)
Genomics & Computational Biology
RNA alternative splicing is primarily responsible for transcriptome diversity and is relevant to human development and disease. However, current approaches to splicing quantification make simplifying assumptions which are violated when RNA sequencing data are heterogeneous. Influences from genetic and environmental background contribute to variability within a group of samples purported to represent the same biological condition. This work describes three methods which account for data heterogeneity when detecting differential RNA splicing between sample groups. First, a robust model is implemented for outlier detection within a group of purported replicates. Next, large RNA-seq datasets with high within-group variability are addressed with a statistical approach which retains power to detect changing splice junctions without sacrificing specificity. Finally, applying these tools to call sQTLs in GTEx tissues has identified splicing variations associated with risk loci for cardiovascular disease and anomalous skeletal development. Each of these methods correctly handles the properties of heterogeneous RNA-seq data to improve precision and reduce false discovery rate.
Norton, Scott Simon, "Methods For Robust Quantification Of RNA Alternative Splicing In Heterogeneous RNA-Seq Datasets" (2019). Publicly Accessible Penn Dissertations. 3460.
Additional Files
twist1-supp-tables.xlsx (39339 kB)
validations_51pct.tsv (8 kB)
validations_100pct.tsv (8 kB)