Graph Neural Networks for Interpretable Biomedical Data Analysis in Genomics and Structural Biology

Kumar, Rachit

Degree type

PhD

Graduate group

Genomics and Computational Biology

Discipline

Bioinformatics
Data Science
Chemistry

Subject

Deep learning
Genomics
Graph neural networks
Machine learning
Structural biology

Copyright date

01/01/2025

Permalink

https://repository.upenn.edu/handle/20.500.14332/61329

Author

Kumar, Rachit

Abstract

The quantity and scope of biomedical data have increased dramatically, and with that increase has come a corresponding need for scalable methods for integrating such data. At the same time, ensuring the interpretability of these methods has become critical, not only for translating them to a variety of domains but also for using them to identify new areas of inquiry and insight. We first provide an overview of the literature in two fields: (1) network-based multiomics analysis; and (2) learnable protein representations in structural biology, with a focus on drug-target affinity. We find that graph neural networks play a prominent role in both paradigms but that many gaps remain in their effective utilization, inspiring us to develop new methods that use graph neural networks more effectively in related paradigms. We then show how graph neural networks can be used in genomics data analysis by discussing our development of a graph neural network model for predicting disease from genomics data alone, showcasing this model in a case study of Alzheimer's disease. To show that graph neural networks provide unique benefits, we highlight our model's performance and inherent interpretability in identifying contributors to the development of Alzheimer's disease. We subsequently discuss the development of a graph neural network that predicts drug-target affinity from protein and drug structures represented as graphs. We further emphasize the value of using graph neural networks in this way by highlighting the model's utility and interpretability in several downstream tasks, such as predicting the impact of protein mutations on drug binding and identifying the binding residues of drugs, as well as its ability to preferentially rank known drugs that target certain proteins.
Our state-of-the-art results from using graph neural networks in these paradigms, achieved while maintaining high levels of interpretability, demonstrate the potential of graph neural networks across the biomedical data analysis spectrum and showcase the inherent interpretability and power that can be gained when using graph-based methods to generate representations of complex biomedical data and to analyze such data.

Advisor

Ritchie, Marylyn D.

Date of degree

2025

Collection

Dissertations and Theses