Cancer Absolute Risk Projection with Incomplete Predictor Variables

Chen, Lu

Files

Chen_upenngdas_0175C_11697.pdf (1.57 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Epidemiology & Biostatistics

Subject

Absolute risk prediction
Breast cancer
Predictive accuracy
Semi-parametric maximum likelihood
Stratified case-control study
Two phase design
Biostatistics

Copyright date

2015-07-20

Permalink

https://repository.upenn.edu/handle/20.500.14332/27797

Author

Chen, Lu

Abstract

A popular approach to projecting absolute cancer risk is to integrate a relative hazard function of predictors with hazard rates obtained from different sources, where the relative hazard function is often approximated by an odds ratio function. When the added value of candidate risk predictors is assessed, data on standard risk predictors are commonly fully available from a frequency-matched case-control study, whereas data on candidate predictors are available only for a subset of cases and controls. In the first project, we developed statistical measures for quantifying the predictive accuracy of cancer absolute risk prediction models while accommodating incomplete predictor variables. We focused in particular on a measure that is useful for evaluating the efficiency of model-based cancer screening: the proportion of cases that can be captured by screening only people with high projected risk. In the second project, using a logistic regression model to describe the relationship between cancer status and risk predictors, we developed a novel semiparametric maximum likelihood approach, under the rare-disease approximation, that accommodates incomplete predictor data when estimating odds ratio parameters and the distribution of candidate predictors. Through theoretical and simulation studies, we showed that our estimator is consistent and asymptotically normal and has improved statistical efficiency. In the third project, we applied the statistical methods developed in the first two projects to evaluate the added value of percent mammographic density and breast cancer risk SNPs in breast cancer absolute risk projection. Our results showed that the two sets of predictors had similar added value and can lead to more efficient model-based screening for breast cancer. In the fourth project, we applied the semiparametric maximum likelihood method to a family-supplemented study design that we proposed to address survival bias in case-control genetic association studies.
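
To make the quantities in the abstract concrete, the following is a minimal Python sketch and not the dissertation's implementation. It assumes unit-length age intervals with constant hazards, a single relative risk per person, and a competing mortality hazard, which is a common way to combine a relative hazard with externally sourced rates. It also assumes a well-calibrated model, so that expected cases are proportional to projected risk. The function names `absolute_risk` and `proportion_of_cases_captured` and all inputs are hypothetical, chosen only for illustration.

```python
import math


def absolute_risk(baseline_hazards, competing_hazards, relative_risk):
    """Projected probability of developing the disease over the given intervals.

    baseline_hazards[j]  : disease hazard in interval j for a reference person
    competing_hazards[j] : hazard of death from other causes in interval j
    relative_risk        : the person's relative hazard versus the reference
    Assumption: every interval has unit length and constant hazards.
    """
    if len(baseline_hazards) != len(competing_hazards):
        raise ValueError("hazard sequences must have the same length")
    risk = 0.0
    cumulative = 0.0  # total hazard accumulated before the current interval
    for h0, m in zip(baseline_hazards, competing_hazards):
        h = h0 * relative_risk
        total = h + m
        if total > 0:
            # P(event-free so far) * P(any event in interval) * P(event is disease)
            risk += math.exp(-cumulative) * (1.0 - math.exp(-total)) * (h / total)
        cumulative += total
    return risk


def proportion_of_cases_captured(risks, fraction_screened):
    """Share of expected cases found among the highest-risk people.

    Assumption: the model is well calibrated, so the expected number of
    cases in a group is the sum of the projected risks in that group.
    """
    if not 0.0 <= fraction_screened <= 1.0:
        raise ValueError("fraction_screened must lie in [0, 1]")
    ranked = sorted(risks, reverse=True)
    n_screened = round(fraction_screened * len(ranked))  # nearest whole person
    total = sum(ranked)
    return sum(ranked[:n_screened]) / total if total > 0 else 0.0
```

For example, `absolute_risk([0.002] * 5, [0.01] * 5, 1.5)` would give a five-year projection for a person at 1.5 times the reference hazard, and `proportion_of_cases_captured(risks, 0.2)` would give the share of expected cases found by screening the 20% of people with the highest projected risk. A more informative predictor set spreads the projected risks further apart and so raises this share at a fixed screening fraction.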

Advisor

Jinbo Chen

Date of degree

2015-01-01

Collection

Dissertations and Theses