ESTIMATION AND PREDICTION PROBLEMS IN MISSING DATA
Discipline
Statistics and Probability
Subject
Machine Learning
Missing Data
Nonparametric estimation
Abstract
In recent years, conformal prediction has emerged as a robust methodology for making finite-sample valid, distribution-free predictions, attracting significant attention across statistics and machine learning. The technique is particularly valued for its versatility: it can be wrapped around any machine learning algorithm to produce valid prediction regions, with the efficiency of those regions closely tied to the underlying algorithm's performance. Despite the wealth of research on optimizing point predictions through methods such as cross-validation, the literature offers little guidance on selecting the machine learning algorithm that yields the most efficient conformal prediction regions. We aim to address this gap by introducing selection algorithms designed to minimize the width of the conformal prediction region while accounting for both coverage and efficiency. Classic conformal prediction requires the underlying data, comprising both the training data and the test points at which predictions are made (which we abbreviate as test data), to be exchangeable (sharing the same distribution); we seek to relax this condition to allow for some covariate shift between the training and test data. This relaxation addresses challenges in many areas, including the missing data literature and causal inference. We reveal and leverage deep connections between modern semiparametric efficiency theory, missing data and causal inference, and emerging methods in conformal prediction for well-calibrated predictive inference. We propose a novel framework that leverages efficient influence functions, allowing for the adaptive calibration of prediction regions under covariate shifts akin to the missing at random assumption. This advancement unlocks more effective prediction intervals without sacrificing coverage accuracy.
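To fix ideas, the basic conformal mechanism the abstract builds on can be sketched as split conformal prediction with absolute-residual scores. This is a minimal illustration only, not the selection framework or covariate-shift calibration proposed in the dissertation; the `fit(X, y) -> predict` interface and the `ols_fit` stand-in learner are assumptions made for the example.

```python
import numpy as np

def split_conformal_interval(fit, X_train, y_train, X_calib, y_calib,
                             X_test, alpha=0.1):
    """Split conformal prediction with absolute-residual scores.

    `fit(X, y)` is any learner returning a `predict(X)` callable; the
    resulting intervals have finite-sample marginal coverage >= 1 - alpha
    when calibration and test data are exchangeable.
    """
    predict = fit(X_train, y_train)
    # Nonconformity scores on the held-out calibration set.
    scores = np.abs(y_calib - predict(X_calib))
    n = len(scores)
    # Finite-sample-corrected empirical quantile of the scores.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    mu = predict(X_test)
    return mu - q, mu + q

def ols_fit(X, y):
    """A stand-in learner: ordinary least squares with an intercept."""
    Z = np.c_[np.ones(len(X)), X]
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return lambda Xn: np.c_[np.ones(len(Xn)), Xn] @ beta
```

Any learner can be substituted for `ols_fit`; the coverage guarantee is unchanged, while the interval width, the efficiency criterion discussed above, depends on how well the learner predicts.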
Indeed, we show that our framework attains large-sample efficiency and validity for any collection of machine learning techniques and their respective tuning parameters. The framework is doubly robust in the sense that it requires only that at least one of two estimated nuisance functions be consistent, without necessarily requiring fast convergence rates for either. Complementing these developments, we also delve into series regression, a cornerstone of nonparametric regression, by introducing a new estimator inspired by the Forster–Warmuth (FW) learner. This estimator not only relaxes the stringent conditions required by traditional series estimators but also extends the FW learner's utility to a broader array of counterfactual nonparametric regression problems, in which the response variable of interest may not be directly observed on all sampled units. By focusing on a unified pseudo-outcome approach, we offer a comprehensive solution to counterfactual regression, achieving minimax rate optimality under less restrictive conditions and demonstrating its application in missing data and causal inference settings. Through these innovations, we aim to bridge gaps in the current literature and introduce tools that promise greater precision and adaptability in statistical prediction and inference.
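The pseudo-outcome idea behind the counterfactual regression results can be illustrated with the standard doubly robust construction for a missing-at-random outcome. This is a generic sketch, not the FW-learner estimator itself; the `fit_outcome`/`fit_propensity` interfaces and the stand-in constant learners are assumptions made for the example.

```python
import numpy as np

def dr_pseudo_outcomes(X, y, observed, fit_outcome, fit_propensity):
    """Doubly robust pseudo-outcomes under missingness at random:

        phi_i = m(X_i) + R_i / pi(X_i) * (Y_i - m(X_i)),

    where m estimates E[Y | X, R = 1] and pi estimates P(R = 1 | X).
    Regressing phi on X with any learner targets E[Y | X]; the target
    is recovered if either m or pi is estimated consistently.
    """
    # Outcome model fit on the observed units only.
    m_hat = fit_outcome(X[observed], y[observed])(X)
    # Clip the estimated propensity away from zero for stability.
    pi_hat = np.clip(fit_propensity(X, observed.astype(float))(X), 1e-3, 1.0)
    # Residual term is zero wherever the outcome is unobserved.
    resid = np.where(observed, y - m_hat, 0.0)
    return m_hat + observed / pi_hat * resid

def mean_outcome_fit(X, y):
    """A stand-in outcome learner: predicts the observed-sample mean."""
    mu = y.mean()
    return lambda Xn: np.full(len(Xn), mu)

def mean_propensity_fit(X, r):
    """A stand-in propensity learner: a constant observation rate."""
    p = r.mean()
    return lambda Xn: np.full(len(Xn), p)
```

When every outcome is observed and the propensity is one, the pseudo-outcome reduces to the outcome itself, so the second-stage regression coincides with ordinary regression of Y on X.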