LinCQA: Faster Consistent Query Answering with Linear Time Guarantees

arxiv(2022)

引用 0|浏览17
暂无评分
摘要
Most data analytical pipelines often encounter the problem of querying inconsistent data that violate pre-determined integrity constraints. Data cleaning is an extensively studied paradigm that singles out a consistent repair of the inconsistent data. Consistent query answering (CQA) is an alternative approach to data cleaning that asks for all tuples guaranteed to be returned by a given query on all (in most cases, exponentially many) repairs of the inconsistent data. This paper identifies a class of acyclic select-project-join (SPJ) queries for which CQA can be solved via SQL rewriting with a linear time guarantee. Our rewriting method can be viewed as a generalization of Yannakakis's algorithm for acyclic joins to the inconsistent setting. We present LinCQA, a system that can output rewritings in both SQL and non-recursive Datalog rules for every query in this class. We show that LinCQA often outperforms the existing CQA systems on both synthetic and real-world workloads, and in some cases, by orders of magnitude.
更多
查看译文
关键词
faster consistent query,linear time guarantees
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要