
Statistics Papers
Document Type
Journal Article
Date of this Version
2011
Publication Source
Journal of the American Statistical Association
Volume
106
Issue
496
Start Page
1566
Last Page
1577
DOI
10.1198/jasa.2011.tm11199
Abstract
This article considers sparse linear discriminant analysis of high-dimensional data. In contrast to the existing methods which are based on separate estimation of the precision matrix Ω and the difference δ of the mean vectors, we introduce a simple and effective classifier by estimating the product Ωδ directly through constrained ℓ1 minimization. The estimator can be implemented efficiently using linear programming and the resulting classifier is called the linear programming discriminant (LPD) rule. The LPD rule is shown to have desirable theoretical and numerical properties. It exploits the approximate sparsity of Ωδ and as a consequence allows cases where it can still perform well even when Ω and/or δ cannot be estimated consistently. Asymptotic properties of the LPD rule are investigated and consistency and rate of convergence results are given. The LPD classifier has superior finite sample performance and significant computational advantages over the existing methods that require separate estimation of Ω and δ. The LPD rule is also applied to analyze real datasets from lung cancer and leukemia studies. The classifier performs favorably in comparison to existing methods.
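The constrained ℓ1 minimization described in the abstract can be written as a linear program. The sketch below is a minimal illustration, not the authors' implementation. It assumes the estimator takes the form: minimize ‖β‖₁ subject to ‖Sβ − (μ̂₁ − μ̂₂)‖∞ ≤ λ, where S is a pooled sample covariance, and that a new observation x is assigned to class 1 when (x − (μ̂₁ + μ̂₂)/2)ᵀβ̂ ≥ 0. The function names `lpd_direction` and `lpd_predict`, the covariance normalization, and the value of the tuning parameter `lam` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def lpd_direction(X1, X2, lam):
    """Estimate beta ~ Omega @ delta directly via an l1-minimizing LP (sketch)."""
    n1, p = X1.shape
    n2 = X2.shape[0]
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    delta = mu1 - mu2
    # Pooled sample covariance; dividing by n1 + n2 is an assumption here.
    S = ((X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)) / (n1 + n2)
    # Split beta = u - v with u, v >= 0, so that sum(u) + sum(v) equals
    # ||beta||_1 at the optimum.
    c = np.ones(2 * p)
    A = np.hstack([S, -S])
    # |S beta - delta| <= lam, written as two sets of one-sided inequalities.
    A_ub = np.vstack([A, -A])
    b_ub = np.concatenate([delta + lam, lam - delta])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    beta = res.x[:p] - res.x[p:]
    return beta, mu1, mu2


def lpd_predict(X, beta, mu1, mu2):
    """Assign class 1 when (x - (mu1 + mu2) / 2) @ beta >= 0, else class 2."""
    scores = (X - (mu1 + mu2) / 2) @ beta
    return np.where(scores >= 0, 1, 2)


# Synthetic data: only the first two coordinates separate the classes.
rng = np.random.default_rng(0)
p, n = 10, 100
shift = np.zeros(p)
shift[:2] = 3.0
X1 = rng.normal(size=(n, p)) + shift
X2 = rng.normal(size=(n, p))
lam = 0.2  # hypothetical tuning value; in practice chosen by validation
beta, mu1, mu2 = lpd_direction(X1, X2, lam)
labels = lpd_predict(np.vstack([X1, X2]), beta, mu1, mu2)
```

Because only the product Ωδ is estimated, no estimate of the precision matrix is ever formed; the program has 2p variables and 2p inequality constraints. For very small `lam` and a singular S the program can be infeasible, in which case the sketch raises an error.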
Copyright/Permission Statement
This is an Accepted Manuscript of an article published by Taylor & Francis in Journal of the American Statistical Association on 24 Jan 2012, available online: http://www.tandfonline.com/10.1198/jasa.2011.tm11199.
Keywords
classification, constrained ℓ1 minimization, Fisher's rule, naive Bayes rule, sparsity
Recommended Citation
Cai, T., & Liu, W. (2011). A Direct Estimation Approach to Sparse Linear Discriminant Analysis. Journal of the American Statistical Association, 106(496), 1566-1577. http://dx.doi.org/10.1198/jasa.2011.tm11199
Date Posted: 27 November 2017