Congressional AI: A Framework for Task Generalization and Alignment with Expert Language Models

Discipline

Business

Subject

Artificial Intelligence
Language Models

Copyright date

2024

Abstract

As foundation models have facilitated rapid adaptation to downstream tasks, a challenge remains in efficiently and flexibly improving their instruction-following capabilities and their alignment to human preference distributions. We propose a novel modular architecture, Congressional AI, consisting of parallel trained "experts", such that the top-k most relevant experts can be activated during inference. These experts are obtained by fine-tuning LoRA adapters on interpretable data mixtures: for instruction-tuning, each dataset corresponds to a task cluster, while for preference alignment to improve steerability, each dataset represents a group or persona. Our experiments, which evaluate cluster-specific adapters across diverse domains on the MMLU benchmark, show that instruction-tuning with Congressional AI through low-rank adapter merging is effective. These findings demonstrate that Congressional AI is a hardware-efficient and interpretable mixture-of-experts (MoE)-style framework for adapting language models to new tasks and domains, and that it can be used to further improve both pre-trained and fine-tuned LLMs.
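The adapter-merging step the abstract describes can be sketched in a few lines. The code below is an illustrative toy, not the thesis's actual implementation: the matrix sizes, the relevance scores, and the softmax weighting over the selected experts are all assumptions. It shows the core idea of keeping a frozen base weight `W` and adding a weighted sum of the top-k experts' low-rank updates `B @ A`.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2            # hidden size and LoRA rank (toy values)
n_experts, top_k = 4, 2

# Frozen base weight and one LoRA adapter pair (A, B) per expert.
W = rng.normal(size=(d, d))
experts = [(rng.normal(size=(r, d)) * 0.01, rng.normal(size=(d, r)) * 0.01)
           for _ in range(n_experts)]

def merged_weight(W, experts, scores, top_k):
    """Merge the top-k most relevant experts' low-rank updates into W."""
    top = np.argsort(scores)[-top_k:]          # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                   # softmax over selected experts only
    delta = sum(w * (B @ A) for w, (A, B) in
                zip(weights, (experts[i] for i in top)))
    return W + delta

# Hypothetical per-expert relevance scores (e.g. task-cluster similarity).
scores = rng.normal(size=n_experts)
W_merged = merged_weight(W, experts, scores, top_k)
print(W_merged.shape)  # (8, 8)
```

Because only k of the adapters contribute at inference time, the merge is cheap relative to the base model, which is the hardware-efficiency property the abstract claims for the framework.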

Publication date

2024-04-08
