Categorical Foundations Of First-Order Abstract Syntax

Dunn, Lawrence, Henry

Files

Dunn_upenngdas_0175C_286/tealeaves-v1.2.0 (5.06 MB)

Dunn_upenngdas_0175C_17033.pdf (1.83 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Computer and Information Science

Discipline

Computer Sciences

Subject

coq
formal verification
metatheory
monads
programming languages
syntax

Copyright date

2025

Permalink

https://repository.upenn.edu/handle/20.500.14332/61387

Abstract

The representation of syntax is central to programming languages and formal verification. Yet existing approaches abstract away from the first-order nominal treatment of syntax found in textbooks, where substitution requires systematic renaming of bound variables to avoid variable capture. Although this representation closely mirrors what compilers actually implement, it remains notoriously difficult to formalize in proof assistants. Moreover, translating to a more convenient internal representation does not eliminate the need to reason about nominal variable binding; it merely shifts the burden to verifying the correctness of the translation. A rigorous mathematical treatment of concrete first-order representations of syntax is therefore required. This work introduces decorated traversable monads (DTMs) as a foundation for extrinsically-scoped, extrinsically-typed first-order abstract syntax. DTMs unify several classical and lesser-known categorical structures under a novel set of coherence laws relating them to each other. The result can be characterized from three theoretically equivalent perspectives, each offering distinct practical advantages. The accompanying Rocq library, Tealeaves, formalizes these results and establishes their practical applicability. Tealeaves faithfully reproduces the functionality of other tools used with Rocq, while extending them with new capabilities. Tealeaves naturally supports variadic and mutually-recursive binders, and it provides certified translations between different representations of variables within a unified framework. Tealeaves provides the first datatype-generic formalization of alpha-equivalence that both (i) matches its conventional definition and (ii) comes with an executable proof that alpha-equivalence classes correspond bijectively to well-formed locally nameless terms, which in turn correspond to de Bruijn terms in a well-formed environment. 
This result enables new forms of modular, reusable, and mechanically-verified reasoning about capture-avoiding substitution, helping bridge a critical gap in the formal verification of programming languages and their implementations.

Advisor

Tannen, Val, B
Zdancewic, Steve, A

Date of degree

2025

Collection

Dissertations and Theses