Machine programming (MP) is a new field of research that uses automation to improve software development productivity (e.g., the time it takes a developer to write code) and quality (e.g., performance, correctness, security, maintainability, etc.). We generally consider MP as a fusion of machine learning and formal methods, which rely heavily on programming languages and systems.
PublicationMISIM: A Novel Code Similarity System(2020-06-01) Ye, Fangke; Zhou, Shengtian; Venkat, Anand; Marcus, Ryan; Tatbul, Nesime; Tithi, Jesmin J; Hasabnis, Niranjan; Petersen, Paul; Mattson, Timothy; Kraska, Tim; Dubey, Pradeep; Gottschlich, Justin E; Gottschlich, Justin ECode similarity systems are integral to a range of applications from code recommendation to automated software defect correction. We argue that code similarity is now a first-order problem that must be solved. To begin to address this, we present machine Inferred Code Similarity (MISIM), a novel end-to-end code similarity system that consists of two core components. First, MISIM uses a novel context-aware semantic structure, which is designed to aid in lifting semantic meaning from code syntax. Second, MISIM provides a neural-based code similarity scoring algorithm, which can be implemented with various neural network architectures with learned parameters. We compare MISIM to three state-of-the-art code similarity systems: (i) code2vec, (ii) Neural Code Comprehension, and (iii) Aroma. In our experimental evaluation across 328,155 programs (over 18 million lines of code), MISIM has 1.5x to 43.4x better accuracy than all three systems. PublicationThe Three Pillars of Machine Programming(2018-01-01) Gottschlich, Justin E; Gottschlich, Justin E; Solar-Lezama, Armando; Tatbul, Nesime; Carbin, Michael; Rinard, Martin; Barzilay, Regina; Amarasinghe, Saman; Tenenbaum, Joshua B; Mattson, TimothyIn this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research. Those pillars are:(i) intention,(ii) invention, and (iii) adaptation. Intention emphasizes advancements in the human-to-computer and computer-to-machine-learning interfaces. Invention emphasizes the creation or refinement of algorithms or core hardware and software building blocks through machine learning (ML). Adaptation emphasizes advances in the use of ML-based constructs to autonomously evolve software. PublicationA Zero-Positive Learning Approach for Diagnosing Software Performance Regressions(2019-01-01) Gottschlich, Justin E; Gottschlich, Justin E; Tatbul, Nesime; Turek, Javier S; Mattson, Timothy; Muzahid, AbdullahThe field of machine programming (MP), the automation of the development of software, is making notable research advances. This is, in part, due to the emergence of a wide range of novel techniques in machine learning. In this paper, we apply MP to the automation of software performance regression testing. A performance regression is a software performance degradation caused by a code change. We present AutoPerf–a novel approach to automate regression testing that utilizes three core techniques:(i) zero-positive learning,(ii) autoencoders, and (iii) hardware telemetry. We demonstrate AutoPerf’s generality and efficacy against 3 types of performance regressions across 10 real performance bugs in 7 benchmark and open-source programs. On average, AutoPerf exhibits 4% profiling overhead and accurately diagnoses more performance bugs than prior state-of-the-art approaches. Thus far, AutoPerf has produced no false negatives.