Balancing Fit and Complexity in Learned Representations

Degree type
Doctor of Philosophy (PhD)
Graduate group
Electrical & Systems Engineering
Discipline
Applied Mathematics
Artificial Intelligence and Robotics
Electrical and Electronics
Subject
Corrupted data
Resilient learning
RKHS
Sparsity
Statistical learning
Copyright date
2022-09-09
Author
Peifer, Maria
Abstract

This dissertation is about learning representations of functions while restricting complexity. In machine learning, maximizing fit and minimizing complexity are conflicting objectives. Common approaches solve a regularized empirical risk minimization problem, with a complexity measure as the regularizer and a regularization parameter that controls the trade-off between the two objectives. The regularization parameter must be tuned by repeatedly solving the problem and does not have a straightforward interpretation. This work instead formulates the problem as a minimization of the complexity measure subject to fit constraints.

The issue of complexity is tackled in reproducing kernel Hilbert spaces (RKHSs) by introducing a novel integral representation of a family of RKHSs that allows arbitrarily placed kernels of different widths. The functional estimation problem is then written as a sparse functional problem, which, despite being non-convex and infinite-dimensional, can be solved in the dual domain. This formulation achieves representations of lower complexity than traditional methods because it searches over a family of RKHSs rather than a subspace of a single RKHS.

The integral representation is then used in a federated classification setting, in which a global model is trained from a federation of agents. This is possible because the dual optimal variables identify the samples that are fundamental to the classification. Each agent therefore learns a local model and sends only these fundamental samples over the network, yielding a federated learning method that requires a single round of network communication. Its solution is proven to converge asymptotically to that of traditional (centralized) classification.

Finally, a theory for constraint specification is established. An optimization problem with a constraint for each sample point can easily become infeasible if the constraints are too tight, while relaxing all constraints can cause the solution to fit the data poorly. The constraint specification method relaxes each constraint until the marginal cost of changing it equals the marginal complexity measure. The resulting problem is proven to be feasible and solvable, and is shown empirically to be resilient to outliers and corrupted training data.
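To make the contrast concrete, the two formulations can be sketched as follows (the notation is illustrative, not taken from the dissertation): ℓ denotes a loss, ρ the complexity measure, λ the regularization parameter, and ε_i per-sample fit tolerances.

% Regularized empirical risk minimization: λ trades off fit and complexity
% and must be tuned by repeatedly solving the problem.
\min_{f \in \mathcal{F}} \; \frac{1}{N} \sum_{i=1}^{N} \ell\big(f(x_i), y_i\big) + \lambda\, \rho(f)

% Constrained formulation studied here: minimize complexity directly,
% subject to explicit, interpretable fit constraints.
\min_{f \in \mathcal{F}} \; \rho(f)
\quad \text{s.t.} \quad \ell\big(f(x_i), y_i\big) \le \epsilon_i, \qquad i = 1, \dots, N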
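A minimal sketch of what such an integral representation over a family of RKHSs might look like, assuming kernels k_σ(s, x) centered at s with width σ (the coefficient function α and the kernel family are assumptions for illustration; the precise construction is given in the dissertation):

% The function is expanded over all centers s and widths σ simultaneously,
% so kernels may be placed anywhere with any width.
f(x) = \int_{\mathcal{S}} \int_{\Sigma} \alpha(s, \sigma)\, k_\sigma(s, x)\, ds\, d\sigma

% Sparsity of the representation is measured on the coefficient function α,
% e.g., by (a relaxation of) its support, and minimized subject to the
% fit constraints above.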
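In the federated setting, the key mechanism is that nonzero dual optimal variables flag the fundamental samples. As an analogy only (the dissertation works with the kernel family above, not with SVMs), a kernel SVM exhibits the same structure: its support vectors are exactly the samples with nonzero dual variables. A minimal Python sketch of the resulting one-round protocol, with hypothetical helper names and scikit-learn's SVC standing in for the dissertation's classifier:

import numpy as np
from sklearn.svm import SVC

def local_fundamental_samples(X, y, gamma=1.0, C=1.0):
    """Fit a local kernel classifier and return only the samples
    with nonzero dual variables (the support vectors)."""
    model = SVC(kernel="rbf", gamma=gamma, C=C)
    model.fit(X, y)
    idx = model.support_  # indices of samples with nonzero dual variables
    return X[idx], y[idx]

def one_shot_federated_fit(agents, gamma=1.0, C=1.0):
    """One communication round: each agent sends its fundamental
    samples; the server trains the global model on their union."""
    Xs, ys = zip(*(local_fundamental_samples(X, y, gamma, C) for X, y in agents))
    return SVC(kernel="rbf", gamma=gamma, C=C).fit(np.vstack(Xs), np.concatenate(ys))

# Hypothetical usage with two agents holding private data:
#   global_model = one_shot_federated_fit([(X1, y1), (X2, y2)])

Each agent runs local_fundamental_samples on its private data and transmits only the support vectors, so the single fit on their union is the only network communication required.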
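One way to formalize the constraint specification idea, again with assumed notation: promote the tolerances ε_i to decision variables and charge a cost h for relaxing them. The relaxed problem is always feasible since the ε_i can grow, and at an optimum the dual variable λ_i of each fit constraint, which measures the marginal complexity of tightening it, matches the marginal relaxation cost:

\min_{f \in \mathcal{F},\; \epsilon \ge 0} \; \rho(f) + \sum_{i=1}^{N} h(\epsilon_i)
\quad \text{s.t.} \quad \ell\big(f(x_i), y_i\big) \le \epsilon_i, \qquad i = 1, \dots, N

% Stationarity in each ε_i gives λ_i* = h'(ε_i*): constraints are relaxed
% until the marginal cost of changing a constraint equals the marginal
% complexity measure.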

Advisor
Alejandro Ribeiro
Date of degree
2021-01-01