Towards More Automated Statistical Inference And Machine Learning

Matteo Sordello, University of Pennsylvania


As the amount of data that we are able to collect keeps growing, the use of Machine Learning and statistical models has become more computationally demanding. Moreover, especially in industry applications, these models need to be trained quickly and efficiently, and updated frequently. This increased complexity makes it necessary to automate the creation and training of models as much as possible.

In the first part of this thesis, we develop two methods to adaptively tune the learning rate of iterative methods for optimization and sampling. While the two settings are substantially different, a similar underlying idea, detecting stationarity of the updates, can be used to gain information about the current state of the system. Knowing when (approximate) stationarity is reached allows us to decay or increase the learning rate at the appropriate time, creating robust strategies that are not very sensitive to an initial misspecification of this crucial parameter. In the second part of this thesis, we turn our attention to Active Learning, a framework that reduces the burden of human annotation needed to create the training data, since the model itself indicates which training points are likely to yield the largest improvement in its performance. Here, we improve the performance of a state-of-the-art method by including focused training in its training routine, allowing the model to select relevant points to use in the optimization phase instead of agnostically looping through all of them.