APPROACHES TO MODEL GENETIC AND GENE-BY-ENVIRONMENT INTERACTIONS UNDERLYING COMPLEX TRAITS USING LATENT FEATURES
Subject
GWAS
machine learning
statistics
Abstract
This thesis develops methods that address the underlying reasons for inefficient SNP discovery in genome-wide association studies (GWAS). After developing methods that remove outliers and simulate linkage disequilibrium (LD), we concluded that heterogeneous phenotypes and gene-by-environment (GxE) effects contribute far more to GWAS's inefficiency than outliers or LD do. Chapter 1 discusses the existing landscape of GWAS and the need for innovative methodologies. Chapters 2 and 3 discuss the REGENS and STAR_outliers Python packages, which were developed to examine the effects of LD and outlier removal, respectively. Once it became apparent that neither of these issues contributes much to GWAS's statistical inefficiency, we developed TRACE to address clinical phenotype heterogeneity and identify previously unseen gene-environment interactions. Chapter 4 explains how our novel TRACE algorithm has identified hundreds of novel SNP associations and created new options for associating SNPs with complex phenotypes. Chapter 5 discusses how TRACE can be generalized to uncover the genetic architecture of other complex phenotypes and outlines possible methods for rigorous validation.
Advisor
Himes, Blanca E.