NEURAL INFERENCE OF PROGRAM SPECIFICATIONS

Degree type
Doctor of Philosophy (PhD)
Graduate group
Computer and Information Science
Discipline
Computer Sciences
Subject
Machine Learning
Programming Languages
Software Engineering
Copyright date
2023
Author
Dinella, Elizabeth, Ann
Abstract

Ensuring program correctness is a fundamental goal in the field of software engineering. Reliable functioning of computer programs is increasingly essential in today's digital world. However, productively writing correct code for complex systems remains a significant challenge. Decades of active research in program reasoning have yielded many fruitful techniques grounded in rules and formal logic. Such symbolic techniques have achieved considerable successes, but they come with some noteworthy limitations. Firstly, many techniques require a correctness property to check against. Without an explicitly provided specification, these approaches fail to perform any level of reasoning. Secondly, many techniques struggle to scale in the presence of constructs widely seen in real-world programs. This dissertation aims to address these challenges by inferring program specifications through statistical patterns from a large corpus of data.

Recently, deep learning techniques have achieved remarkable breakthroughs in many domains, including natural language processing and image generation. Inspired by these advances, this dissertation applies techniques from the field of deep learning to program reasoning. A data-driven paradigm of program specification inference is appealing as it automatically provides a definition of program correctness for a reasoning tool to check against. Furthermore, such a system can be invoked in a quick and lightweight query. This dissertation demonstrates the promise of deep-learning-based specification inference techniques in a variety of program reasoning tasks, including static bug finding, merge conflict resolution, and automated testing. For each domain, it provides datasets, methodologies, neural techniques, and comparative evaluations. It also includes a detailed analysis of the tradeoffs between data-driven techniques and traditional symbolic methods, as well as the benefits of combining these techniques. It concludes with a summary of the insights gained and lessons learned, offering guidance for the application of specification inference to additional program reasoning domains.

Advisor
Naik, Mayur
Date of degree
2023