Learning To Compositionally Reason Over Natural Language

Degree type
Doctor of Philosophy (PhD)
Graduate group
Computer and Information Science
Discipline
Artificial Intelligence and Robotics
Computer Sciences
Subject
compositionality
machine learning
natural language processing
question answering
reasoning
Copyright date
2021-08-31T20:21:00-07:00
Author
Gupta, Nitish
Abstract

The human ability to understand the world in terms of reusable "building blocks" allows us to generalize in near-infinite ways. Developing language understanding systems that can compositionally reason in a similar manner is crucial to achieving human-like capabilities. Designing such systems presents key challenges in both the architectural design of machine learning models and the learning paradigm used to train them. This dissertation addresses aspects of both challenges by exploring compositional structured models that can be trained using end-task supervision. We believe that solving complex problems in a generalizable manner requires decomposing them into sub-tasks, which in turn are solved using reasoning capabilities that can be reused in novel contexts. Motivated by this idea, we develop a neuro-symbolic model with a modular architecture for language understanding and focus on answering questions that require multi-step reasoning against natural language text. We design an inventory of freely composable, learnable neural modules that perform atomic language understanding and symbolic reasoning tasks in a differentiable manner. The question guides how these modules are dynamically composed to yield an end-to-end differentiable model that performs compositional reasoning and can be trained using end-task supervision. However, we show that when trained with such supervision, a compositional model structure alone is not sufficient to induce the intended problem decomposition in terms of the modules: without supervision for the sub-tasks, the modules fail to compose freely in novel ways, hurting generalization. To address this, we develop a new training paradigm that leverages paired examples, instances that share sub-tasks, to provide a training signal beyond that provided by individual examples. We show that this paradigm induces the intended compositional reasoning and leads to improved in- and out-of-distribution generalization.
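
To make the two central ideas concrete, below is a minimal, hypothetical Python (PyTorch) sketch, not the dissertation's actual model or loss. It shows (a) a question-driven "program" selecting which neural modules to compose into one differentiable model, and (b) a paired-example training step that adds a consistency term tying the intermediate outputs of two examples that share a sub-task. All names (ModuleInventory, paired_step, the module names "find"/"filter"/"count", the MSE consistency term, shapes, and hyperparameters) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch only: assumed names, shapes, and loss terms,
# not the dissertation's actual implementation.

DIM = 16  # toy hidden dimension

class ModuleInventory(nn.Module):
    """A tiny inventory of learnable modules composed per-question."""
    def __init__(self):
        super().__init__()
        self.modules_by_name = nn.ModuleDict({
            "find":   nn.Linear(DIM, DIM),  # e.g., locate relevant spans
            "filter": nn.Linear(DIM, DIM),  # e.g., restrict by a condition
            "count":  nn.Linear(DIM, DIM),  # e.g., aggregation-style step
        })
        self.head = nn.Linear(DIM, 3)  # end-task prediction

    def forward(self, x, program):
        z = x
        for name in program:  # question-driven dynamic composition
            z = torch.relu(self.modules_by_name[name](z))
        return self.head(z), z  # end-task logits + intermediate output

def paired_step(model, opt, batch1, batch2, lam=0.5):
    """End-task loss on each example, plus a consistency term that ties
    the intermediate outputs of two examples sharing a final sub-task."""
    (x1, y1, p1), (x2, y2, p2) = batch1, batch2
    logits1, z1 = model(x1, p1)
    logits2, z2 = model(x2, p2)
    loss = F.cross_entropy(logits1, y1) + F.cross_entropy(logits2, y2)
    loss = loss + lam * F.mse_loss(z1, z2)  # paired-example signal
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage with toy data: both "programs" end in the shared "count" sub-task,
# so their intermediate outputs are encouraged to agree.
model = ModuleInventory()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
b1 = (torch.randn(8, DIM), torch.randint(0, 3, (8,)), ["find", "count"])
b2 = (torch.randn(8, DIM), torch.randint(0, 3, (8,)), ["find", "filter", "count"])
paired_step(model, opt, b1, b2)
```

The design point the sketch is meant to convey: the per-example end-task loss alone leaves the modules' sub-task behavior unconstrained, while the paired-example term supplies an extra signal that pushes modules toward the intended, reusable decomposition.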

Advisor
Dan Roth
Date of degree
2021-01-01