Date of Award: 2018
Doctor of Philosophy (PhD)
Genomics & Computational Biology
Jason H. Moore
Genome-wide association studies (GWAS) have been extensively critiqued for their perceived inability to adequately elucidate the genetic underpinnings of complex disease. Of particular concern is “missing heritability,” or the difference between the total estimated heritability of a phenotype and that explained by GWAS-identified loci. There are numerous proposed explanations for this missing heritability, but a frequently ignored and potentially vastly informative alternative explanation is the ubiquity of epistasis underlying complex phenotypes.
Given our understanding of how biomolecules interact in networks and pathways, it is not unreasonable to conclude that the effect of variation at an individual genetic locus may depend non-additively on its interacting partners, and should be analyzed in that context. It has been recognized for over a century that deviation from expected Mendelian proportions can be explained by the interaction of multiple loci, and the epistatic underpinnings of phenotypes in model organisms have been extensively quantified by experiment. Therefore, the dearth of inspiring single-locus GWAS hits for complex human phenotypes (and the inconsistent replication of these hits between populations) should not be surprising, as one might expect the joint effect of multiple perturbations to interacting partners within a functional biological module to be more important than individual main effects.
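The non-additive dependence described above can be illustrated with a toy model. The following is a minimal sketch, not a method from the dissertation: the penetrance values, the carrier frequency of 0.5, and the reduction of each locus to carrier/non-carrier status are all hypothetical choices made for illustration. It shows an interaction pattern in which neither locus has any marginal (main) effect, even though disease risk clearly depends on the two loci jointly.

```python
# Hypothetical penetrance table: P(disease | carrier status at locus A, locus B).
# Risk is elevated only when exactly one of the two loci carries the variant.
PENETRANCE = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 0.1}

# Assumed carrier frequency at each locus; the loci are assumed independent.
CARRIER_FREQ = 0.5


def marginal_risk(locus: int, status: int, freq: float = CARRIER_FREQ) -> float:
    """Disease risk given the status at one locus, averaged over the other locus."""
    risk = 0.0
    for other in (0, 1):
        key = (status, other) if locus == 0 else (other, status)
        weight = freq if other == 1 else 1.0 - freq
        risk += weight * PENETRANCE[key]
    return risk


def interaction_contrast() -> float:
    """Departure from additivity on the risk scale (zero for a purely additive model)."""
    return (PENETRANCE[(1, 1)] - PENETRANCE[(1, 0)]
            - PENETRANCE[(0, 1)] + PENETRANCE[(0, 0)])


if __name__ == "__main__":
    for locus, name in enumerate(("A", "B")):
        effect = marginal_risk(locus, 1) - marginal_risk(locus, 0)
        print(f"Main effect of locus {name}: {effect:+.3f}")
    print(f"Interaction contrast: {interaction_contrast():+.3f}")
```

Under these assumptions, a single-locus scan would find nothing at either locus, while a test of the joint genotype would detect the signal.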
Current methods for analyzing data from GWAS are not well-equipped to detect epistasis or replicate significant interactions. The multiple testing burden associated with testing each pairwise interaction grows quadratically with the number of loci and quickly becomes nearly insurmountable. Statistical and machine learning approaches that have worked well for other types of high-dimensional data are appealing and may be useful for detecting epistasis, but may need to be adapted to function appropriately. Biological knowledge may also be leveraged to guide the search for epistasis candidates, but requires context-appropriate application (as, for example, two loci with significant main effects may not have a significant interaction, and vice versa).
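To make the pairwise multiple testing burden concrete, the sketch below counts the number of distinct locus pairs and the corresponding Bonferroni-corrected significance threshold. The function name, the family-wise error rate of 0.05, and the example panel sizes are illustrative assumptions; Bonferroni is used here only as a simple, conservative reference point, not as the correction the dissertation adopts.

```python
from math import comb


def pairwise_test_burden(n_loci: int, alpha: float = 0.05) -> tuple[int, float]:
    """Return the number of pairwise tests and the Bonferroni per-test threshold.

    n_loci: number of loci considered (hypothetical panel size).
    alpha:  desired family-wise error rate (assumed to be 0.05 here).
    """
    n_tests = comb(n_loci, 2)  # n * (n - 1) / 2 distinct unordered pairs
    return n_tests, alpha / n_tests


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        tests, threshold = pairwise_test_burden(n)
        print(f"{n:>9,} loci: {tests:,} pairwise tests, "
              f"per-test threshold {threshold:.2e}")
```

Because the pair count grows with the square of the number of loci, a tenfold increase in loci raises the number of tests roughly a hundredfold, which is one motivation for filtering candidate pairs before testing.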
Rather than dismissing GWAS, and the wealth of associated data that has accumulated, as a failure, I propose the development of new techniques and the incorporation of diverse data sources to analyze GWAS data in an epistasis-centric framework.
Piette, Elizabeth, "Strategies For Improving Epistasis Detection And Replication" (2018). Publicly Accessible Penn Dissertations. 3081.