Datetime Feature Recommendation Using Textual Information

Satoshi Masuda, Takaaki Tateishi, Toshihiro Takahashi

International Conference on Knowledge-Based Intelligent Information & Engineering Systems (2023)

Abstract
Data science, the analysis of large amounts of data to gain new knowledge, is now in widespread use and socially important. Feature engineering, the process of extracting features from data, is one of its main tasks, and because it relies on the experience of experts, research is under way to automate it. In this paper, we propose a novel approach that automates feature identification from the textual information in data column names. Specifically, we apply natural language processing and source code analysis to the data descriptions and source code in Python notebooks to build a knowledge database, with a particular focus on datetime features. Based on that knowledge database, we develop a system that recommends datetime features for newly given textual information. In experiments, we confirmed the classification accuracy of the knowledge database, applied it to actual forecasting tasks such as home price forecasting, and achieved an average accuracy gain of 5.76%.
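To make the idea of recommending datetime features from column names concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a tiny hand-written knowledge base that maps column-name tokens to candidate datetime features; the names `KNOWLEDGE_BASE`, `tokenize`, and `recommend`, as well as every entry in the mapping, are hypothetical. The paper builds its knowledge database from Python notebooks, whereas this sketch simply hard-codes a few entries and ranks features by how many tokens vote for them.

```python
import re
from collections import Counter

# Hypothetical toy knowledge base (an assumption for illustration only):
# column-name token -> datetime features that might be derived from it.
KNOWLEDGE_BASE = {
    "date": ["year", "month", "day_of_week"],
    "sold": ["year", "month"],
    "built": ["year"],
    "time": ["hour", "day_of_week"],
}


def tokenize(column_name):
    """Split a snake_case or camelCase column name into lowercase tokens."""
    # Insert a space at each lowercase/digit-to-uppercase boundary (camelCase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", column_name)
    # Split on any run of non-alphanumeric characters and drop empty pieces.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def recommend(column_name, top_k=3):
    """Return up to top_k datetime features, ranked by token votes."""
    votes = Counter()
    for token in tokenize(column_name):
        for feature in KNOWLEDGE_BASE.get(token, []):
            votes[feature] += 1
    # Sort by descending vote count, then alphabetically for a stable order.
    ranked = sorted(votes, key=lambda f: (-votes[f], f))
    return ranked[:top_k]


if __name__ == "__main__":
    for name in ["DateSold", "year_built", "price"]:
        print(name, "->", recommend(name))
```

A column name with no known tokens, such as `price`, yields an empty recommendation list, which is one simple way to signal that no datetime feature applies.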