Advances in Neural Information Processing Systems
We establish theoretical results concerning all local optima of various regularized M-estimators, where both loss and penalty functions are allowed to be nonconvex. Our results show that as long as the loss function satisfies restricted strong convexity and the penalty function satisfies suitable regularity conditions, any local optimum of the composite objective function lies within statistical precision of the true parameter vector. Our theory covers a broad class of nonconvex objective functions, including corrected versions of the Lasso for errors-in-variables linear models; regression in generalized linear models using nonconvex regularizers such as SCAD and MCP; and graph and inverse covariance matrix estimation. On the optimization side, we show that a simple adaptation of composite gradient descent may be used to compute a global optimum up to the statistical precision ε_stat in log(1/ε_stat) iterations, which is the fastest possible rate of any first-order method. We provide a variety of simulations to illustrate the sharpness of our theoretical predictions.
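As a rough illustration of the composite gradient idea mentioned in the abstract, the following is a minimal sketch and not the authors' procedure. It assumes an ordinary least-squares loss and the MCP penalty, and it uses the closed-form proximal map of MCP (firm thresholding) in a plain proximal gradient loop. The paper's own constraints and step-size rules are not reproduced. The names `mcp_prox` and `composite_gradient_descent`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np

def mcp_prox(z, lam, b, step):
    """Elementwise proximal map of step * MCP(lam, b); assumes b > step."""
    a = np.abs(z)
    # Middle region: soft-threshold, then rescale to undo part of the shrinkage.
    shrunk = np.sign(z) * (a - step * lam) / (1.0 - step / b)
    # Small inputs go to zero; inputs beyond b * lam are left untouched.
    return np.where(a <= step * lam, 0.0, np.where(a <= b * lam, shrunk, z))

def composite_gradient_descent(X, y, lam, b=3.0, n_iter=500):
    """Proximal gradient on (1/2n)||y - X beta||^2 + MCP penalty (illustrative)."""
    n, p = X.shape
    # Step size 1/L, where L is the largest eigenvalue of X^T X / n.
    step = n / np.linalg.norm(X, 2) ** 2
    if b <= step:
        raise ValueError("this sketch requires b > step for a well-defined prox")
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n          # gradient of the smooth loss
        beta = mcp_prox(beta - step * grad, lam, b, step)
    return beta

# Hypothetical synthetic example: sparse truth, low noise.
rng = np.random.default_rng(0)
n, p = 400, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -2.0, 2.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat = composite_gradient_descent(X, y, lam=0.1)
print("estimation error:", np.linalg.norm(beta_hat - beta_true))
```

In this toy setting the sample size exceeds the dimension, so the least-squares loss is strongly convex and the example does not exercise the restricted strong convexity regime the abstract is concerned with. It only shows the shape of the update.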
Loh, P., & Wainwright, M. J. (2013). Regularized M-estimators With Nonconvexity: Statistical and Algorithmic Theory for Local Optima. Advances in Neural Information Processing Systems, 26. Retrieved from https://repository.upenn.edu/statistics_papers/222
Date Posted: 27 November 2017