Machine Learning For The Diagnosis Of Lung Disease

William Lindsay, University of Pennsylvania


Lung cancer and interstitial lung disease (ILD) are two lung diseases that impose a heavy burden on society. Despite widespread interest from the scientific, medical, and patient communities, machine learning methods for improving the diagnosis of these diseases remain underutilized in the clinical setting. Given the numerous barriers to adoption, solutions are needed that apply the right methods to the right datasets to answer relevant clinical questions. Our laboratory engages in clinical collaborations to create such datasets and applies interpretable machine learning methods to train diagnosis models that answer specific clinical questions. In this thesis, we present a machine learning-based diagnosis classifier for suspicious thoracic lesions that is capable of differentiating between primary and secondary cancer and is trained on a purpose-built, pathology-confirmed lung nodule dataset. We also present a diagnosis classifier for ILD that generates a differential diagnosis list from human-extracted image features with higher accuracy than radiologists alone. Together, our work represents an important step toward translating machine learning analyses from the research setting into the clinical domain, though additional validation with multi-institutional datasets will be required before widespread adoption.