Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)

Graduate Group

Epidemiology & Biostatistics

First Advisor

Mary E. Putt

Second Advisor

Devan V. Mehrotra


In a crossover clinical trial, including period-specific baselines as covariates in a regression model is known to increase the precision of the estimated treatment effect. The potential efficiency gain depends, in part, on the true model, the distribution and covariance matrix of the vector of baselines and outcomes, and the model chosen for analysis. We examine improvements in power that can be achieved by incorporating an optimal linear combination of baselines (LCB). For a known distribution, the optimal LCB minimizes the conditional variance corresponding to a treatment effect. The use of a single metric to capture the information in the baseline measurements is appealing for crossover designs: because of their efficiency, crossover designs tend to have small sample sizes, and thus the number of covariates in a model can significantly impact the degrees of freedom in the analysis. We start by examining optimal LCB models under a normality assumption for uniform and incomplete block designs. For uniform designs, such as the AB/BA design, estimation is entirely through within-subject contrasts (and thus ordinary least squares [OLS]), and the optimal LCB minimizes the conditional variance corresponding to the treatment effect. However, since the optimal LCB is a function of the unknown covariance matrix, we propose an adaptive method that uses the LCB covariate corresponding to the most plausible covariance structure, as guided by the data. For incomplete block designs, data are commonly analyzed using a mixed effects model. Treatment effect estimates from this analysis are complex functions of both within-subject and between-subject treatment contrasts. To improve efficiency, we propose incorporating period-specific optimal LCBs, which minimize the conditional variance of the period-specific outcomes. A simpler fixed effects analysis of covariance involving only within-subject contrasts is also described for small sample situations.
In small sample situations, hypothesis tests based on the mixed effects analyses exhibit inflated type I error rates even when the Kenward-Roger approach is used to adjust the degrees of freedom. Lastly, we extend this work to the more general setting where the optimal LCB depends on the distribution of the response vector. In practice, the distribution is unknown and the optimal LCB is estimated under some loss function. To handle both normal and non-normal response data, OLS and a rank-based nonparametric regression model (R-estimation) are considered. A data-driven approach is then proposed that adaptively chooses the best-fitting model from a set of models that work well under a range of conditions. Relative to commonly used methods, such as change-from-baseline analyses without covariates, our methods using functions of baselines as period-specific or period-invariant covariates consistently demonstrate improved power across a number of crossover designs, covariance structures, and response distributions.
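
To make the AB/BA case concrete, the following is a minimal sketch, not the dissertation's adaptive procedure. It assumes a known covariance matrix for the vector (X1, X2, Y1, Y2) of period baselines and outcomes under normality, and uses the standard best-linear-predictor result to find the weights that minimize the conditional variance of the within-subject difference D = Y1 - Y2. The covariance values and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical covariance matrix for (X1, X2, Y1, Y2): baselines X and
# outcomes Y in periods 1 and 2. Unit variances; correlation 0.8 between a
# baseline and the outcome in the same period, 0.5 for every other pair.
SIGMA = np.array([
    [1.0, 0.5, 0.8, 0.5],
    [0.5, 1.0, 0.5, 0.8],
    [0.8, 0.5, 1.0, 0.5],
    [0.5, 0.8, 0.5, 1.0],
])
CONTRAST = np.array([1.0, -1.0])  # within-subject difference D = Y1 - Y2


def split_covariance(sigma):
    """Return Cov(X), Cov(X, D), and Var(D) for D = Y1 - Y2."""
    s_xx = sigma[:2, :2]
    s_xd = sigma[:2, 2:] @ CONTRAST
    var_d = CONTRAST @ sigma[2:, 2:] @ CONTRAST
    return s_xx, s_xd, var_d


def optimal_lcb_weights(sigma):
    """Weights c minimizing Var(D | c'X) under normality: Cov(X)^-1 Cov(X, D)."""
    s_xx, s_xd, _ = split_covariance(sigma)
    return np.linalg.solve(s_xx, s_xd)


def conditional_variance(sigma, weights):
    """Var(D | c'X) when the single covariate c'X is used."""
    s_xx, s_xd, var_d = split_covariance(sigma)
    return var_d - (weights @ s_xd) ** 2 / (weights @ s_xx @ weights)


def change_from_baseline_variance(sigma):
    """Var of (Y1 - X1) - (Y2 - X2), i.e. change from baseline, no covariate."""
    a = np.array([-1.0, 1.0, 1.0, -1.0])  # coefficients on (X1, X2, Y1, Y2)
    return a @ sigma @ a


if __name__ == "__main__":
    weights = optimal_lcb_weights(SIGMA)
    _, _, var_d = split_covariance(SIGMA)
    print("optimal LCB weights:", weights)
    print("variance, no baselines:", var_d)
    print("variance, change from baseline:", change_from_baseline_variance(SIGMA))
    print("variance, optimal LCB covariate:", conditional_variance(SIGMA, weights))
```

With this assumed covariance matrix, the optimal weights are proportional to X1 - X2, and the variance of the treatment contrast is about 1.0 with no baselines, 0.8 for change from baseline, and 0.64 with the optimal LCB as a single covariate. A different covariance structure would give different weights, which is why the covariance matrix has to be estimated or selected from the data in practice.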

Included in

Biostatistics Commons