Machine Learning in Function Spaces

Seidman, Jacob, Hugh

Files

Seidman_upenngdas_0175C_15521.pdf (8.26 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Applied Mathematics and Computational Science

Discipline

Mathematics

Subject

Machine Learning
Operator Learning
PDEs

Copyright date

2022

Permalink

https://repository.upenn.edu/handle/20.500.14332/58936

Abstract

Operator learning is an emerging area of machine learning that aims to learn mappings (operators) between functions from data. Many physical systems can be formulated mathematically as relationships between functional data, so operator learning has the potential to be a transformative tool in applications such as fluid dynamics, solid mechanics, and climate science. While many classical and successful machine learning approaches to regression focus on data in finite-dimensional vector spaces, the straightforward application of these methods to discretizations of functional data can be limiting. This motivates the need for resolution-invariant methods that learn on the function spaces of the data themselves rather than on their discretizations. In this thesis we contribute to this line of work with a new operator learning method, LOCA (Learning Operators with Coupled Attention). Inspired by the success of the attention mechanism, LOCA learns operators between function spaces by averaging features of the input function with a probability distribution that depends on the location of the output function query. These distributions are then coupled to each other across the query domain with a kernel integral transformation, allowing the model to learn correlations in the output functions. This has the additional effect of making the model particularly data-efficient in terms of the number of output function measurements available for training. The construction of LOCA is accompanied by proofs of universality, demonstrating that it is expressive enough to approximate any continuous operator, and by demonstrations of state-of-the-art performance on several operator learning benchmarks. Next, some fundamental limitations of operator architectures with linear decoders, such as LOCA, are discussed, and connections are made to known concepts in approximation theory. This leads to a simple lower bound on the approximation error of such architectures and sheds light on when they become less effective, such as when modelling advection-dominated phenomena. Finally, we propose the use of nonlinear decoders in operator learning architectures as a necessary modification to avoid the lower bound that limits their performance. This modification is shown to significantly increase the performance of operator learning architectures while simultaneously requiring fewer model parameters. In total, these results make a notable contribution to the field of operator learning and present several interesting future directions for research.
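The coupled attention step described in the abstract can be illustrated with a small sketch. This is not the thesis implementation: the score function, the Gaussian coupling kernel, the cosine-mode input features, and all function names below are hypothetical stand-ins for learned components, chosen only so the example runs with the Python standard library. It shows the two ideas named above: query-dependent scores are averaged over the query domain with a kernel (a simple normalized quadrature of the kernel integral), and the resulting probability distribution is used to average features of the input function.

```python
import math

def softmax(scores):
    """Map real scores to a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rbf_kernel(y, y_prime, length_scale=0.2):
    """Assumed coupling kernel kappa(y, y'); a Gaussian is used for illustration."""
    return math.exp(-((y - y_prime) ** 2) / (2.0 * length_scale ** 2))

def score_fn(y, n_features):
    """Hypothetical stand-in for a learned score network g(y) in R^n."""
    return [math.sin((i + 1) * math.pi * y) for i in range(n_features)]

def input_features(u_samples, n_features):
    """Hypothetical stand-in for learned input features v(u) in R^n:
    averages of the sampled input function against cosine modes."""
    m = len(u_samples)
    return [
        sum(u * math.cos(i * math.pi * (j + 0.5) / m)
            for j, u in enumerate(u_samples)) / m
        for i in range(n_features)
    ]

def coupled_attention_weights(y, grid, n_features):
    """Kernel-average the scores g(y') over the query grid, then apply softmax.
    The kernel average couples the distributions at nearby query points."""
    k = [rbf_kernel(y, yp) for yp in grid]
    norm = sum(k)
    scores = [score_fn(yp, n_features) for yp in grid]
    coupled = [
        sum(k_j * s[i] for k_j, s in zip(k, scores)) / norm
        for i in range(n_features)
    ]
    return softmax(coupled)

def evaluate_operator(u_samples, y, grid, n_features=4):
    """Output at query y: the average of v(u) under the distribution phi(y)."""
    phi = coupled_attention_weights(y, grid, n_features)
    v = input_features(u_samples, n_features)
    return sum(p * f for p, f in zip(phi, v))

# Usage: a sine input function sampled on a uniform grid, queried at y = 0.3.
grid = [j / 20 for j in range(21)]
u_samples = [math.sin(2.0 * math.pi * x) for x in grid]
output_at_query = evaluate_operator(u_samples, 0.3, grid)
```

Because the weights form a probability distribution, the output at each query point in this sketch is a convex combination of the input features. The output is therefore linear in the features v(u), which is the kind of linear decoder structure whose limitations the abstract goes on to discuss.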

Advisor

Pappas, George, J
Preciado, Victor, M

Date of degree

2022

Collection

Dissertations and Theses