## Publicly Accessible Penn Dissertations

2017

Dissertation

#### Degree Name

Doctor of Philosophy (PhD)

Statistics

Tony Cai

#### Abstract

High-dimensional linear models play an important role in the analysis of modern data sets. Although the estimation problem is well understood, there is still a paucity of methods and theory for the inference problem in high-dimensional linear models. This thesis focuses on statistical inference for high-dimensional linear models and consists of the following three parts.

1. The first part of the thesis considers confidence intervals for linear functionals in high-dimensional linear regression. We first establish the convergence rates of the minimax expected length for confidence intervals. Furthermore, we investigate the problem of adaptation to sparsity for the construction of confidence intervals and identify the regimes in which it is possible to construct adaptive confidence intervals.

2. In the second part of the thesis, we consider point and interval estimation of the $\ell_q$ loss of a given estimator in high-dimensional linear regression. For the class of rate-optimal estimators, we establish the minimax rates for estimating their $\ell_q$ losses, the minimax expected length of confidence intervals for their $\ell_q$ losses, and the possibility of adaptivity of confidence intervals for their $\ell_q$ losses.

3. In the third part of the thesis, we consider the inference problem in the framework of high-dimensional instrumental variable regression and construct confidence intervals for the treatment effect in the presence of possibly invalid instrumental variables. We develop a novel selection procedure, Two-Stage Hard Thresholding (TSHT), to select valid instrumental variables and construct honest confidence intervals for the treatment effect using the selected instrumental variables.
