Deep Lifelong Learning with Factorized Knowledge Transfer

Lee, Seungwon

Files

Lee_upenngdas_0175C_16294.pdf (9.66 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Computer and Information Science

Discipline

Data Science

Subject

Deep Learning
Lifelong Machine Learning
Machine Learning

Copyright date

01/01/2024

Permalink

https://repository.upenn.edu/handle/20.500.14332/60107

Abstract

Human intelligence can capture abstract concepts from experience and apply that learned knowledge to adapt to new situations. Lifelong machine learning aims to achieve the same properties by designing algorithms that learn from a sequence of tasks, extract useful knowledge from previous tasks, and reuse the extracted knowledge to learn future tasks. Research into lifelong learning has explored various methodologies, including techniques for sharing knowledge across tasks, techniques for maintaining previously acquired skills, and techniques for actively selecting the next task to learn. This dissertation focuses on one theme of lifelong learning: the way knowledge is transferred across tasks via factorization, which breaks down the architecture of neural networks to naturally encode conceptual knowledge. Tensor factorization is capable of discovering abstract but generalizable knowledge from experience. This dissertation investigates methods to factorize the knowledge encoded in neural networks and share it across multiple tasks, as well as methods to enhance the training of these factorized knowledge transfer mechanisms. The dissertation starts by developing a lifelong learning architecture that uses a deconvolutional operation to preserve multi-axis features of the data. This deconvolution-based factorization architecture empirically shows reduced harmful interference between tasks, thanks to sharing abstract knowledge via factorization. The dissertation then studies the importance of transferring the proper level of knowledge in the network for the success of lifelong learning. As a result, an expectation-maximization-style algorithm is developed to discover the useful granularity of knowledge to share for each task, depending on the given data. This algorithm determines which layers to share while learning tasks in parallel and reduces human intervention in selecting the knowledge transfer architecture for lifelong learning, which is critical for realistic scenarios with complex task relationships. Moreover, it applies to diverse lifelong learning architectures, augmenting existing lifelong learning methods. Lastly, the dissertation investigates the use of data programming to extend existing lifelong learning algorithms to semi-supervised settings, tackling the lifelong learning challenge of data annotation. Owing to its modularized framework and theoretical guarantees on the quality of the generated labels, this approach can be applied to existing supervised lifelong learning algorithms.

Advisor

Eaton, Eric

Date of degree

2024

Collection

Dissertations and Theses