Foundation Reward Models for Robot Learning

Degree type

Doctor of Philosophy (PhD)

Graduate group

Computer and Information Science

Discipline

Computer Sciences

Subject

Foundation Models
Reinforcement Learning
Robot Learning

Funder

Grant number

License

Copyright date

2025

Distributor

Related resources

Contributor

Abstract

Learning-based algorithms for robotic control have achieved remarkable success in recent years. However, a fundamental bottleneck in existing approaches is their heavy reliance on human supervision, whether through expert demonstrations in imitation learning or carefully engineered reward functions in reinforcement learning. This dependency limits scalability, as it is infeasible to collect demonstrations or design reward functions for every possible task and environment. This thesis proposes a viable path towards scaling robotics by introducing foundation reward models: models capable of generating dense reward labels for robot states, sensory observations, and actions across a wide range of tasks and embodiments. With foundation reward models, robots can train on more diverse, mixed-quality data and learn from data that they gather themselves, bypassing the bottleneck of human supervision. The key technical challenge, however, is the lack of available robot data with which to train such models to generalize. To address this challenge, we present two classes of approaches that train foundation reward models entirely without robot data: (1) a novel offline reinforcement learning algorithm that learns goal-conditioned value functions from unstructured human videos, enabling zero-shot reward generation for unseen robot tasks specified in image or language modalities, and (2) a framework combining large language models with search to automatically design programmatic reward functions for robot simulation environments, enabling sim-to-real transfer of novel skills such as a quadruped robot dog balancing on a yoga ball.

Date of degree

2025

Date Range for Data Collection (Start Date)

Date Range for Data Collection (End Date)

Digital Object Identifier

Series name and number

Volume number

Issue number

Publisher

Publisher DOI

Journal Issues

Comments

Recommended citation