Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)

Graduate Group

Genomics & Computational Biology

First Advisor

Marylyn D. Ritchie


Pleiotropy is the phenomenon in which a gene or genetic variant affects more than one phenotype. This fundamental concept is thought to play a critical role in genetics, medicine, evolutionary biology, molecular biology, and clinical research. With recent developments in sequencing technologies and statistical methods, pleiotropy can be characterized systematically in the human genome. Circulatory system diseases and nervous system disorders have a significant impact on mortality rates worldwide and frequently co-occur in patients. Thus, the field would benefit greatly from knowledge of the underlying genetic relationships among the diseases in these categories. In this dissertation, we aim to identify pleiotropy across a wide range of circulatory system diseases and nervous system disorders using large-scale electronic health record-linked biobank datasets. For common genetic variants, we applied an ensemble of methods, including univariate, multivariate, and sequential multivariate association methods, to characterize pleiotropy in the UK Biobank and the eMERGE network. Our results implicated five pleiotropic regions that help to explain the disease relationships across these disease categories. For rare variants, we performed univariate burden and dispersion tests using whole-exome sequencing data from the UK Biobank and characterized 143 Bonferroni-significant pleiotropic genes. Our analytical framework for both common and rare genetic variants offers novel insights into biology and provides a new perspective for studying pleiotropy in large-scale biobank datasets. Besides applying statistical methods to natural biomedical datasets, we also conducted simulation studies investigating the impact of sample size imbalance on the performance of the proposed statistical methods. Our simulation results can serve as a reference to guide sample size design for association studies.


Additional Files

Appendix A.xlsx (235 kB)
Appendix B.pdf (3752 kB)
Appendix C.pdf (551 kB)
Appendix D.txt (115 kB)
Appendix E.xlsx (13 kB)