COMPOSITIONAL GENERALIZATION IN INSTRUCTION FOLLOWING TASKS
Subject
Machine Learning
Natural Language Processing
Abstract
Understanding instructions expressed in natural language is a fundamental task in artificial intelligence. A key feature of natural language that humans rely on when giving and following instructions is compositionality: the capacity to understand and produce a potentially infinite number of novel combinations from familiar components. This ability is instrumental for learning from limited data and is crucial for instruction-following robots to function in the real world. This dissertation studies the compositional generalization abilities of machine learning models on instruction-following tasks. We examine several dimensions of compositionality across a diverse set of tasks of varying complexity: semantic parsing in synthetic languages, natural language instruction following in a blocks world, and vision-and-language navigation in complex indoor environments. We demonstrate empirically that existing systems for these tasks, while performant on standard i.i.d. test sets that require only interpolation, fail to generalize compositionally, which requires extrapolation. We then present different strategies to induce compositionality, ranging from data augmentation to auxiliary tasks to a simple neuro-symbolic algorithm. We introduce a compositional spatial representation language and discuss how using such a rich symbolic representation as auxiliary supervision can improve generalization in complex, real-world, multi-modal instruction-following tasks. Finally, we aim to develop a more foundational understanding of robust generalization by focusing on the task of learning regular languages, where we study the benefits of compositional models over end-to-end ones from both theoretical and empirical perspectives.