Visualizing and Querying Large-scale Structured Datasets by Learning Multi-layered 3D Meta-Profiles

2022 IEEE International Conference on Big Data (Big Data) (2022)

Abstract
Data profiling is a "set of statistical data analysis activities to determine properties of a dataset". Historically, it targeted data rather than meta-data, but at scale the tables' meta-data (i.e., titles, attribute names, and types) becomes abundant, so profiling it becomes vital, especially for understanding the contents of large-scale structured datasets. Here we describe and evaluate the algorithms and models behind our scalable Meta-data profiler. It is capable of learning Meta-profiles for a topic of interest in extreme-scale structured datasets, such as WDC [1] or CORD-19 [2], which have millions of tables and hundreds of thousands of sources. A 3D Meta-profile visualizes a specific topic (e.g., COVID-19 vaccine side effects) present in a large-scale structured dataset and simplifies access and comparison for data scientists and end users.
Keywords
3D, large-scale, multi-layered, meta-profiles