INTERPRETING PERSONAL GENOMES WITH LONG-READ TRANSCRIPTOME SEQUENCING

Degree type
PhD
Graduate group
Genomics and Computational Biology
Discipline
Genetics and Genomics
Bioinformatics
Medicine and Health Sciences
Copyright date
01/01/2025
Author
Wang, Robert
Abstract

With the cost of sequencing a human genome having plummeted from $3 billion during the Human Genome Project to $600 today, plans to sequence millions of people in the developed world are now plausible. The era of “personal genomes” has steadily enabled more precise approaches to diagnosing, treating, and preventing a broad swath of human diseases. Yet, despite its immense promise, personal genomics is constrained by a long-standing, fundamental challenge: variant interpretation. While our genomes are rich with variation, our ability to interpret the functional and clinical impact of each variant remains limited. One widely adopted strategy for improving variant interpretation is the integration of RNA sequencing (RNA-seq) data, which captures sequence variation in exonic regions, as well as changes in gene expression and splicing induced by regulatory variants. Over the past decade, such approaches have provided new insights into the genetic and molecular etiology of common diseases, rare diseases, and cancer. With the rapid evolution of long-read technologies, it is now possible to sequence entire transcript molecules and determine their haplotype of origin, a breakthrough that will further enhance our ability to study the cis effects of genetic variants on the transcriptome. However, large-scale clinical applications of long-read RNA-seq remain in their infancy due to limitations in sequencing cost, throughput, and accuracy. Here, we present the development of technological and computational strategies that can overcome these limitations. Aided by these approaches, we then demonstrate the utility of long-read RNA-seq in advancing personal genome interpretation, using rare monogenic diseases as a model system for human genetics.
First, we demonstrate how long-read RNA-seq applied to minimally invasive tissues from patients with rare metabolic disorders enables both the discovery of pathogenic variants (including those missed by standard genetic testing) and the characterization of their molecular consequences. We then explore how transcript expression information from long-read RNA-seq data on healthy human tissues can improve clinical interpretation of putative loss-of-function variants in haploinsufficient genes and, moreover, inform the development of potential therapeutic strategies for diseases caused by gene haploinsufficiency.

Advisor
Xing, Yi
Date of degree
2025