Integrative analysis of transcriptomic data to elucidate regulators of pre-mRNA processing

Degree type
Doctor of Philosophy (PhD)
Graduate group
Genomics and Computational Biology
Discipline
Bioinformatics
Subject
Alternative polyadenylation
Alternative splicing
DDX55
RNA binding proteins
RNA-seq
Transcriptomics
Copyright date
2025
Author
Gazzara, Matthew, R
Abstract

Alternative pre-mRNA processing events, including alternative splicing (AS) and alternative polyadenylation (APA), are key drivers of transcriptomic and proteomic diversity. This dissertation presents a comprehensive framework for analyzing these events, emphasizing improvements to computational tools and workflows that enhance the accuracy, accessibility, and interpretability of transcriptomic data across diverse high-throughput sequencing modalities, including RNA-seq, targeted 3' end sequencing, and CLIP-seq. Chapters 2 and 3 focus on the development and application of MAJIQ v2 (a co-first-author contribution), which advances splicing analysis by enabling more accurate identification and quantification of both simple and complex AS events. A key innovation is the MAJIQ v2 Modulizer, which facilitates regulatory analysis by decomposing complex splicing patterns into discrete, interpretable modules made up of binary AS event building blocks. This modular representation, combined with MAJIQ’s superior accuracy and ability to handle intron retention, enables more robust splicing quantification across tissues (GTEx) and improves the discovery of splicing QTLs (sQTLs), as demonstrated by validated MAJIQ-sQTLs in CYP11B1 (Chapter 3). Chapter 4 presents benchmarking work from APAeval, highlighting methodological advances for the identification and quantification of APA from RNA-seq data. In Chapter 5, ENCODE RBP knockdown data are uniformly processed with DaPars to systematically profile the impact of RNA-binding proteins (RBPs) on 3'UTR isoform diversity. This integrative analysis uncovers several novel regulators of APA, most notably the RNA helicase DDX55, and provides the community with a standardized analysis resource for investigating APA regulation by RBPs.
Finally, Chapter 6 introduces the Comparative Analysis of Alternative RNA Processing (CAARP), a lightweight framework and repository of reproducible analysis scripts, provided as IPython notebooks, that support flexible and extensible exploration of regulatory features and RBPs that may play a role in user-defined sets of AS and APA events from RNA-seq data. CAARP has already been applied in multiple published studies and offers a scalable foundation for future regulatory analyses of pre-mRNA processing across varied transcriptomic datasets.

Advisor
Barash, Yoseph
Lynch, Kristen, W
Date of degree
2025