Date of Award
Doctor of Philosophy (PhD)
Genomics & Computational Biology
Genetic and epigenetic alterations combine to drive cancer progression. Heterogeneous cell populations within tumors are associated with poor prognosis and outcomes. Copy number aberrations (CNAs), a type of genetic variant common in tumors, are used as markers to detect subclones and reconstruct tumor phylogeny. Integrating CNAs with other modalities at the level of tumor subclones facilitates studying the interplay between the genome and the epigenome, and their effects on the transcriptome. Computational methods for multi-omics integration across different types of single-cell and spatial transcriptomics (ST) tumor sequencing data are still lacking. Therefore, the aim of this thesis is to extract (allele-specific) CNA signals from single-cell and ST tumor sequencing data, which enables multi-omics integration at the subclone level. We achieved this through the development of two methods: Alleloscope (Chapter 2) and Clonalscope (Chapter 3). Alleloscope is a computational method for profiling allele-specific CNAs in single-cell DNA- and/or transposase-accessible chromatin-sequencing (scDNA-seq, scATAC-seq) data, enabling integrative analysis of allele-specific copy number and chromatin accessibility. On scDNA-seq data from gastric, colorectal and breast cancer samples, with validation using matched linked-read sequencing, Alleloscope found pervasive occurrence of highly complex, multiallelic CNAs, in which cells that carry varying allelic configurations adding to the same total copy number coevolve within a tumor. On scATAC-seq data from two basal cell carcinoma samples and a gastric cancer cell line, Alleloscope detected multiallelic copy number events and copy-neutral loss-of-heterozygosity, enabling dissection of the contributions of chromosomal instability and chromatin remodeling to tumor evolution.
To detect genetically distinct subclones based on CNAs, we also developed Clonalscope, a subclone detection method for multiple types of single-cell and ST tumor sequencing data that leverages prior information from matched bulk DNA-seq data. Clonalscope implements a nested Chinese Restaurant Process to model the evolutionary process in tumors. On scRNA-seq and scATAC-seq data from three gastrointestinal tumor samples, Clonalscope labeled malignant cells and identified genetically distinct subclones, which were validated in detail using matched scDNA-seq data. On ST data from a basal cell carcinoma and two invasive ductal carcinoma samples, Clonalscope labeled malignant spots, traced subclones between related datasets, and identified spatially segregated subclones expressing genes associated with drug resistance and survival.
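As a rough illustration of the clustering prior named above, the following minimal Python sketch draws cluster labels from a plain (non-nested) Chinese Restaurant Process. It is not the Clonalscope implementation: the function name `sample_crp`, the concentration parameter `alpha`, and the `seed` argument are assumptions introduced here, and the nesting, the CNA likelihood, and the bulk DNA-seq prior described in the abstract are all omitted.

```python
import random


def sample_crp(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a plain Chinese Restaurant Process.

    Hypothetical helper for illustration only; alpha (> 0) is the
    concentration parameter controlling how readily new clusters open.
    """
    rng = random.Random(seed)
    labels = []   # labels[i] = cluster index assigned to item i
    counts = []   # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # Join existing cluster k with weight counts[k];
        # open a new cluster with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # a new cluster is opened
        else:
            counts[k] += 1        # an existing cluster grows
        labels.append(k)
    return labels


if __name__ == "__main__":
    # Example: 20 cells; larger alpha tends to yield more clusters.
    print(sample_crp(20, alpha=1.0))
```

In this sketch each item (for example, a cell or an ST spot) joins a cluster with probability proportional to that cluster's current size, so the number of clusters does not have to be fixed in advance. A full subclone model would combine such a prior with a likelihood over the observed CNA signal.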
Wu, Chi-Yun, "Multi-Omics Integration Through Single-Cell Copy Number Analysis In Cancer" (2022). Publicly Accessible Penn Dissertations. 5589.
Bioinformatics Commons, Biostatistics Commons, Oncology Commons