Statistics Papers

Document Type

Journal Article


Publication Source

Journal of Machine Learning Research




We study some stability properties of algorithms which minimize (or almost-minimize) empirical error over Donsker classes of functions. We show that, as the number n of samples grows, the L_2-diameter of the set of almost-minimizers of empirical error with tolerance x(n) = o(n^{-1/2}) converges to zero in probability. Hence, even in the case of multiple minimizers of expected error, as n increases it becomes less and less likely that adding a sample (or a number of samples) to the training set will result in a large jump to a new hypothesis. Moreover, under some assumptions on the entropy of the class, along with an assumption of Komlós-Major-Tusnády type, we derive a power rate of decay for the diameter of almost-minimizers. This rate, through an application of a uniform ratio limit inequality, is shown to govern the closeness of the expected errors of the almost-minimizers. In fact, under the above assumptions, the expected errors of almost-minimizers become closer with a rate strictly faster than n^{-1/2}.
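
For readers who want the first result in symbols, one way to write it is sketched below. The notation is an assumption made for illustration and is not given in the abstract: P_n denotes the empirical average over the n samples Z_1, ..., Z_n, \mathcal{F} the Donsker class, M_n the set of almost-minimizers, and the L_2 norm is taken with respect to the sampling distribution P.

```latex
% Illustrative notation only (assumed, not taken from the abstract):
%   P_n f = (1/n) \sum_{i=1}^{n} f(Z_i)   empirical error of f
%   \mathcal{F}                           the Donsker class
%   x(n)                                  the tolerance, x(n) = o(n^{-1/2})
\[
  M_n = \Bigl\{ f \in \mathcal{F} :
        P_n f - \inf_{g \in \mathcal{F}} P_n g \le x(n) \Bigr\},
  \qquad x(n) = o\bigl(n^{-1/2}\bigr),
\]
\[
  \operatorname{diam}(M_n)
    = \sup_{f, f' \in M_n} \| f - f' \|_{L_2(P)}
    \longrightarrow 0
  \quad \text{in probability as } n \to \infty .
\]
```

Read this way, the stability claim is that any two hypotheses whose empirical errors are both within x(n) of the minimum must be close in L_2 once n is large, even when the expected error has several minimizers.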


At the time of publication, author Alexander Rakhlin was affiliated with the Massachusetts Institute of Technology. Currently, he is a faculty member in the Statistics Department at the University of Pennsylvania.


empirical risk minimization, empirical processes, stability, Donsker classes



Date Posted: 27 November 2017

This document has been peer reviewed.