LEVERAGING PATIENT BIOBANKS TO EXPLORE UNDERSTUDIED GENETIC VARIATIONS AND AI-DERIVED IMAGING TRAITS
Abstract
The use of integrated clinical and genetic data available within patient biobanks has transformed human genetics research and led to numerous foundational discoveries. However, a persistent and consequential concern is the lack of adequate genetic diversity within these biobanks, which limits research findings and their generalizability in downstream applications. We propose a statistical genome-first approach to identifying and targeting protein-altering variants significantly enriched in individuals genetically similar to African reference populations relative to European reference populations, and we perform agnostic phenotype association analyses in the Penn Medicine Biobank (PMBB). We find that this framework captures not only well-known positive controls, such as variants in APOL1 associated with kidney disease risk and in PCSK9 associated with lower LDL levels, but also novel, replicated associations: a variant in UNC45B associated with hearing loss risk and another in HSPA2 associated with influenza infection. We then leverage AI-based approaches to extract imaging-derived phenotypes from patient CT studies in the PMBB. We demonstrate the rich, granular clinical detail captured by these imaging phenotypes by recapitulating known biological trends with age, sex, height, and weight and by correlating the phenotypes with disease diagnoses. In addition, we show that our imaging traits are associated with future disease diagnoses, such as increased heart volume with future cardiovascular disease and decreased kidney attenuation with future risk of crystal arthropathies. Integrating our imaging traits with patient genetic data revealed associations of genetic loci in SOAT1 with adrenal gland volume and in SH2B3 with spleen volume.
In summary, our diversity-focused genome-first approach and our AI-driven image phenotyping highlight the powerful discovery potential of a patient biobank and its capacity to address pervasive challenges in genetics research and advance our understanding of human complex traits and diseases.
Advisor
Ritchie, Marylyn D.