NEURAL INFERENCE OF PROGRAM SPECIFICATIONS

Dinella, Elizabeth, Ann

Degree type

Doctor of Philosophy (PhD)

Graduate group

Computer and Information Science

Discipline

Computer Sciences

Subject

Machine Learning
Programming Languages
Software Engineering

Copyright date

2023

Permalink

https://repository.upenn.edu/handle/20.500.14332/59440

Abstract

Ensuring program correctness is a fundamental goal of software engineering, and the reliable functioning of computer programs is increasingly essential in today's digital world. However, productively writing correct code for complex systems remains a significant challenge. Decades of active research in program reasoning have yielded many fruitful techniques grounded in rules and formal logic. Such symbolic techniques have achieved considerable success, but they come with noteworthy limitations. First, many techniques require a correctness property to check against; without an explicitly provided specification, these approaches cannot perform any reasoning. Second, many techniques struggle to scale in the presence of constructs widely seen in real-world programs. This dissertation aims to address these challenges by inferring program specifications from statistical patterns in a large corpus of data.

Recently, deep learning techniques have achieved remarkable breakthroughs in many domains, including natural language processing and image generation. Inspired by these advances, this dissertation applies deep learning techniques to program reasoning. A data-driven paradigm of program specification inference is appealing because it automatically provides a definition of program correctness for a reasoning tool to check against. Furthermore, such a system can be invoked in a quick and lightweight query. This dissertation demonstrates the promise of deep-learning-based specification inference techniques in a variety of program reasoning tasks, including static bug finding, merge conflict resolution, and automated testing. For each domain, it provides datasets, methodologies, neural techniques, and comparative evaluations. It also includes a detailed analysis of the tradeoffs between data-driven techniques and traditional symbolic methods, as well as the benefits of combining them. It concludes with a summary of the insights gained and lessons learned, offering guidance for applying specification inference to additional program reasoning domains.

Advisor

Naik, Mayur

Date of degree

2023

Collection

Dissertations and Theses