NL2Formula: Generating Spreadsheet Formulas from Natural Language Queries

arXiv (Cornell University) (2024)

Abstract
Writing formulas on spreadsheets, such as Microsoft Excel and Google Sheets, is a widespread practice among users performing data analysis. However, crafting formulas on spreadsheets remains a tedious and error-prone task for many end-users, particularly when dealing with complex operations. To alleviate the burden associated with writing spreadsheet formulas, this paper introduces a novel benchmark task called NL2Formula, with the aim to generate executable formulas that are grounded on a spreadsheet table, given a Natural Language (NL) query as input. To accomplish this, we construct a comprehensive dataset consisting of 70,799 paired NL queries and corresponding spreadsheet formulas, covering 21,670 tables and 37 types of formula functions. We realize the NL2Formula task by providing a sequence-to-sequence baseline implementation called fCoder. Experimental results validate the effectiveness of fCoder, demonstrating its superior performance compared to the baseline models. Furthermore, we also compare fCoder with an initial GPT-3.5 model (i.e., text-davinci-003). Lastly, through in-depth error analysis, we identify potential challenges in the NL2Formula task and advocate for further investigation.
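To make the task's input and output concrete, the following is a minimal sketch of what one NL2Formula instance might look like: a table, an NL query, and a formula that can be executed against that table. The table contents, the query, the target formula, and the helper `evaluate_sum` are all hypothetical and chosen for illustration. They are not taken from the dataset or from fCoder, and the tiny evaluator handles only single-column `SUM` ranges.

```python
import re

# Hypothetical toy table; row 1 is the header, as in a spreadsheet.
table = [
    ["Region", "Sales"],
    ["North", 120],
    ["South", 80],
    ["East", 100],
]

# Hypothetical task instance: the model would receive the query and the table
# and be expected to produce the formula.
nl_query = "What is the total sales across all regions?"
target_formula = "=SUM(B2:B4)"  # illustrative target, not from the dataset


def evaluate_sum(formula, table):
    """Evaluate a formula of the form =SUM(<col><row>:<col><row>) on one column."""
    match = re.fullmatch(r"=SUM\(([A-Z])(\d+):([A-Z])(\d+)\)", formula)
    if match is None or match.group(1) != match.group(3):
        raise ValueError("only single-column SUM ranges are supported in this sketch")
    col = ord(match.group(1)) - ord("A")
    first_row, last_row = int(match.group(2)), int(match.group(4))
    # Spreadsheet rows are 1-indexed, so row r lives at table[r - 1].
    return sum(table[r - 1][col] for r in range(first_row, last_row + 1))


# "Executable" and "grounded" here mean the formula refers to real cells
# of the table and yields a value when run against it.
result = evaluate_sum(target_formula, table)
```

Executing the predicted formula against the table, rather than only comparing formula strings, is one way to check that a generated formula is grounded in the table it was written for.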
Keywords
Spreadsheets, Spreadsheet Errors