Minimax Optimality In High-Dimensional Classification, Clustering, And Privacy

Zhang, Linjun

Files

Zhang_upenngdas_0175C_13624.pdf (970.79 KB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Statistics

Subject

Classification
Clustering
Differential Privacy
High-dimensional data
Minimax Optimal
Non-convex Optimization
Computer Sciences
Statistics and Probability

Copyright date

2019-08-27T20:19:00-07:00

Permalink

https://repository.upenn.edu/handle/20.500.14332/30280

Abstract

The age of “Big Data” features large volumes of massive and high-dimensional datasets, which has led to the rapid emergence of new algorithms as well as new concerns such as privacy and fairness. To compare algorithms with or without these new constraints, minimax decision theory provides a principled framework for quantifying the optimality of algorithms and investigating the fundamental difficulty of statistical problems. Within the framework of minimax theory, this thesis addresses the following four problems:

1. The first part develops an optimality theory for linear discriminant analysis in the high-dimensional setting. It also considers classification with incomplete data under the missing completely at random (MCR) model.
2. The second part studies high-dimensional sparse Quadratic Discriminant Analysis (QDA) and aims to establish the optimal convergence rates.
3. The third part studies the optimality of high-dimensional clustering in the unsupervised setting under the Gaussian mixture model. It proposes an EM-based procedure that attains the optimal rate of convergence for the excess mis-clustering error.
4. The fourth part investigates minimax optimality under the privacy constraint for mean estimation and linear regression models, in both the classical low-dimensional and modern high-dimensional settings.

Advisor

Tony Cai

Date of degree

2019-01-01

Collection

Dissertations and Theses