Measurement Error And Missing Data Methods In Biomarker Research

Degree type
Doctor of Philosophy (PhD)
Graduate group
Epidemiology & Biostatistics
Subject
competing risks
longitudinal analysis
measurement error
missing data
mixed-effects models
survival analysis
Biostatistics
Copyright date
2021-08-31T20:20:00-07:00
Author
Caswell, Carrie
Abstract

Measurement error and missing data are two phenomena that prevent researchers from observing essential quantities in their studies. Measurement error occurs when data are subject to variability that masks an underlying true value. Recognizing measurement error is essential to preventing bias in an analysis, and methods to handle it have been well developed in recent years. In time-to-event analyses, however, competing risks are another important consideration that can invalidate study results if not properly accounted for. Current methods for competing risks do not account for measurement error and, as a result, incur a large amount of bias when covariates are measured with error. We first propose a novel method that combines the intuition of the subdistribution model for competing risks with risk set regression calibration, which corrects for measurement error in Cox regression by recalibrating at each failure time. We show through simulations that the proposed estimator removes the bias that occurs when measurement error is ignored.
The second part of this dissertation addresses missing outcome data in longitudinal models. Although this is a well-studied area of research, some current missing data methods are subject to misspecification, while others are not suited to handling a large amount of missing data. We propose a novel method to account for missing longitudinal outcome data when some patients have no recorded outcomes. We accomplish this by using an auxiliary outcome that is available for all patients, and we avoid the pitfall of misspecification by estimating its relationship with the data nonparametrically. We show that this method is more efficient than conventional methods and robust to misspecification.
For both proposed methods, we show that the estimators are asymptotically normal, and we provide consistent variance estimates. We also show that the estimator for the second method is consistent. We apply both proposed methods to neurodegenerative disease data. Finally, we introduce an R package that implements the first proposed method and makes it widely available for regular use.
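
The bias that the abstract attributes to ignored measurement error can be illustrated with a small simulation. The sketch below is not the dissertation's method: it uses ordinary linear regression rather than Cox or subdistribution hazard models, and a basic regression calibration step based on two replicate measurements rather than risk set regression calibration. All names (`ols_slope`, `simulate`) and parameter values are hypothetical choices made for illustration.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def sample_variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def simulate(n=20000, beta=1.0, sigma_x=1.0, sigma_u=1.0, seed=0):
    """Return (naive slope, calibrated slope) for an assumed true slope beta."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sigma_x) for _ in range(n)]   # true covariate (unobserved)
    w1 = [xi + rng.gauss(0.0, sigma_u) for xi in x]   # error-prone measurement
    w2 = [xi + rng.gauss(0.0, sigma_u) for xi in x]   # independent replicate
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]  # outcome depends on true x

    # Naive analysis: regress y on the error-prone measurement.
    naive = ols_slope(w1, y)

    # Error variance from replicates: Var(w1 - w2) = 2 * sigma_u^2.
    var_u = sample_variance([a - b for a, b in zip(w1, w2)]) / 2.0
    var_w = sample_variance(w1)
    reliability = (var_w - var_u) / var_w

    # Regression calibration: replace w1 by an estimate of E[X | W].
    mean_w = sum(w1) / n
    x_hat = [mean_w + reliability * (w - mean_w) for w in w1]
    calibrated = ols_slope(x_hat, y)
    return naive, calibrated


if __name__ == "__main__":
    naive, calibrated = simulate()
    print(f"naive slope:      {naive:.3f}")
    print(f"calibrated slope: {calibrated:.3f}")
```

With equal variances for the true covariate and the error, the naive slope is attenuated toward roughly half the true value, while the calibrated slope should land close to it. The dissertation's setting is more involved because the calibration is redone within each risk set at every failure time.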

Advisor
Sharon X. Xie
Date of degree
2019-01-01