Context-Aware Parse Trees

Authors :
Ye, Fangke
Zhou, Shengtian
Venkat, Anand
Marcus, Ryan
Petersen, Paul
Tithi, Jesmin Jahan
Mattson, Tim
Kraska, Tim
Dubey, Pradeep
Sarkar, Vivek
Gottschlich, Justin
Publication Year :
2020

Abstract

The simplified parse tree (SPT) presented in Aroma, a state-of-the-art code recommendation system, is a tree-structured representation used to infer code semantics by capturing program structure rather than program syntax. This is a departure from the classical abstract syntax tree, which is principally driven by programming language syntax. While we believe a semantics-driven representation is desirable, the specifics of an SPT's construction can impact its performance. We analyze these nuances and present a new tree structure, heavily influenced by Aroma's SPT, called a context-aware parse tree (CAPT). CAPT enhances SPT by providing a richer level of semantic representation. Specifically, CAPT provides additional binding support for language-specific techniques for adding semantically-salient features, and language-agnostic techniques for removing syntactically-present but semantically-irrelevant features. Our research quantitatively demonstrates the value of our proposed semantically-salient features, enabling a specific CAPT configuration to be 39% more accurate than SPT across the 48,610 programs we analyzed.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2003.11118
Document Type :
Working Paper