
TANE: an Efficient Algorithm for Discovering Functional and Approximate Dependencies

Computer Journal (1999) | CCF B | SCI Zone 4

Univ Helsinki

Cited 768 | Views 78
Abstract
The discovery of functional dependencies from relations is an important database analysis technique. We present TANE, an efficient algorithm for finding functional dependencies from large databases. TANE is based on partitioning the set of rows with respect to their attribute values, which makes testing the validity of functional dependencies fast even for a large number of tuples. The use of partitions also makes the discovery of approximate functional dependencies easy and efficient, and erroneous or exceptional rows can be identified easily. Experiments show that TANE is fast in practice. For benchmark databases, the running times are improved by several orders of magnitude over previously published results. The algorithm is also applicable to much larger datasets than the previous methods.
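The partition idea in the abstract can be illustrated with a small sketch. This is not the TANE implementation (which works level by level over attribute sets and uses more compact partition representations); it only shows why partitions make a single validity test cheap. The function names, the toy `rows` data, and the choice of error measure (the minimum fraction of rows that must be removed for the dependency to hold) are assumptions made for this illustration.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into classes of rows that agree on all of attrs."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(i)
    return list(groups.values())

def fd_holds(rows, lhs, rhs):
    """lhs -> rhs holds exactly when adding rhs does not split any lhs class,
    i.e. both partitions have the same number of classes."""
    return len(partition(rows, lhs)) == len(partition(rows, list(lhs) + [rhs]))

def exceptional_rows(rows, lhs, rhs):
    """Indices of rows to drop so that lhs -> rhs holds: within each lhs class,
    keep the rows carrying the most frequent rhs value and flag the rest."""
    flagged = []
    for cls in partition(rows, lhs):
        counts = defaultdict(int)
        for i in cls:
            counts[rows[i][rhs]] += 1
        majority = max(counts, key=counts.get)
        flagged.extend(i for i in cls if rows[i][rhs] != majority)
    return sorted(flagged)

def fd_error(rows, lhs, rhs):
    """Fraction of rows that must be removed for lhs -> rhs to hold."""
    if not rows:
        return 0.0
    return len(exceptional_rows(rows, lhs, rhs)) / len(rows)

# Hypothetical toy relation, invented for this example.
rows = [
    {"zip": "00100", "city": "Helsinki", "name": "A"},
    {"zip": "00100", "city": "Helsinki", "name": "B"},
    {"zip": "33100", "city": "Tampere", "name": "C"},
    {"zip": "33100", "city": "Turku", "name": "D"},
]

print(fd_holds(rows, ["zip"], "city"))   # exact dependency test
print(fd_error(rows, ["zip"], "city"))   # approximate dependency error
print(exceptional_rows(rows, ["zip"], "city"))
```

An approximate dependency can then be accepted whenever `fd_error` stays below a user-chosen threshold, and `exceptional_rows` points at the rows responsible for the violation.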
Chat Paper

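[Key points]: This paper presents TANE, an efficient algorithm for discovering functional dependencies in databases. Its novelty lies in partitioning the set of tuples by their attribute values, which makes testing the validity of a functional dependency more efficient.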

[Method]: TANE uses partitions to optimize the process of discovering functional dependencies.

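[Experiments]: The paper tests TANE on large databases and shows that it remains efficient when handling a large number of tuples, but no specific dataset names or detailed experimental results are given.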