Johnson, Kory


  • Publication
    Discrete Methods in Statistics: Feature Selection and Fairness-Aware Data Mining
    (2016-01-01) Johnson, Kory
    This dissertation is a detailed investigation of issues that arise in models that change discretely. Models are often constructed by either including or excluding features based on some criteria. These discrete changes are challenging to analyze due to correlation between features. Feature selection is the problem of identifying an appropriate set of features to include in a model, while fairness-aware data mining is the problem of removing the influence of protected features from a model. This dissertation provides frameworks for understanding each problem and algorithms for accomplishing the desired goal. The feature selection problem is addressed through the framework of sequential hypothesis testing. We elucidate the statistical challenges in repeatedly using inference in this domain and demonstrate how current methods fail to address them. Our algorithms build on classically motivated multiple testing procedures to control measures of false rejections when using hypothesis testing during forward stepwise regression. Furthermore, these methods have much higher power than recent proposals from the conditional inference literature. The fairness-aware data mining community is grappling with fundamental questions concerning fairness in statistical modeling. Tension exists between identifying explainable differences between groups and discriminatory ones. We provide a framework for understanding the connections between fairness and the use of protected information in modeling. With this discussion in hand, generating fair estimates is straightforward.