NL2Formula: Generating Spreadsheet Formulas from Natural Language Queries

Authors :
Zhao, Wei
Hou, Zhitao
Wu, Siyuan
Gao, Yan
Dong, Haoyu
Wan, Yao
Zhang, Hongyu
Sui, Yulei
Zhang, Haidong
Publication Year :
2024

Abstract

Writing formulas on spreadsheets, such as Microsoft Excel and Google Sheets, is a widespread practice among users performing data analysis. However, crafting formulas on spreadsheets remains a tedious and error-prone task for many end-users, particularly when dealing with complex operations. To alleviate the burden associated with writing spreadsheet formulas, this paper introduces a novel benchmark task called NL2Formula, which aims to generate executable formulas that are grounded on a spreadsheet table, given a Natural Language (NL) query as input. To accomplish this, we construct a comprehensive dataset consisting of 70,799 paired NL queries and corresponding spreadsheet formulas, covering 21,670 tables and 37 types of formula functions. We realize the NL2Formula task by providing a sequence-to-sequence baseline implementation called fCoder. Experimental results validate the effectiveness of fCoder, demonstrating its superior performance compared to the baseline models. We also compare fCoder with an initial GPT-3.5 model (i.e., text-davinci-003). Lastly, through in-depth error analysis, we identify potential challenges in the NL2Formula task and advocate for further investigation.

Comment: To appear at EACL 2024
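To make the task setup in the abstract concrete, the following is a minimal sketch of what a single NL2Formula-style example might look like: a table, an NL query, and a formula grounded on that table. Everything here is an assumption for illustration only: the `Example` class, the `cells_in_range` and `evaluate` helpers, and the sample data are hypothetical and are not taken from the dataset or from fCoder. The toy evaluator handles only four functions over a single A1-style range, whereas the abstract reports 37 function types without listing them.

```python
import re
from dataclasses import dataclass


@dataclass
class Example:
    """Hypothetical container for one (table, NL query, formula) triple."""
    table: list   # list of rows; the first row is the header
    query: str    # natural-language question about the table
    formula: str  # target spreadsheet formula grounded on the table


def cells_in_range(table, ref):
    """Return the values in an A1-style range such as 'B2:B4'.

    Assumption: single-letter columns only (A-Z), 1-based row numbers.
    """
    m = re.fullmatch(r"([A-Z])(\d+):([A-Z])(\d+)", ref)
    if m is None:
        raise ValueError(f"unsupported range: {ref}")
    c1, r1, c2, r2 = m.groups()
    col1, col2 = ord(c1) - ord("A"), ord(c2) - ord("A")
    row1, row2 = int(r1) - 1, int(r2) - 1
    return [table[r][c]
            for r in range(row1, row2 + 1)
            for c in range(col1, col2 + 1)]


# Toy subset of aggregate functions; not the benchmark's function inventory.
FUNCS = {
    "SUM": sum,
    "MAX": max,
    "MIN": min,
    "AVERAGE": lambda xs: sum(xs) / len(xs),
}


def evaluate(table, formula):
    """Evaluate a formula of the form '=FUNC(A1:B2)' against the table."""
    m = re.fullmatch(r"=([A-Z]+)\(([A-Z]\d+:[A-Z]\d+)\)", formula)
    if m is None or m.group(1) not in FUNCS:
        raise ValueError(f"unsupported formula: {formula}")
    return FUNCS[m.group(1)](cells_in_range(table, m.group(2)))


# Invented sample data, purely illustrative.
example = Example(
    table=[["City", "Sales"], ["Oslo", 120], ["Lima", 80], ["Pune", 100]],
    query="What is the total sales across all cities?",
    formula="=SUM(B2:B4)",
)
result = evaluate(example.table, example.formula)
print(example.query, "->", example.formula, "->", result)
```

In this sketch the formula is "executable" in the sense that it can be run against the table to produce an answer; a model for the task would be expected to produce the `formula` string given the `query` and `table`.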

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2402.14853
Document Type :
Working Paper