Adaptive Statistics in Oracle 12c

PVLDB (2017)

Abstract
Database Management Systems (DBMS) continue to be the foundation of mission-critical applications, both OLTP and analytics. They provide a safe, reliable, and efficient platform to store and retrieve data. SQL is the lingua franca of the database world. A database developer writes a SQL statement to specify data sources and express the desired result, and the DBMS figures out the most efficient way to implement it. The query optimizer is the component in a DBMS responsible for finding the best execution plan for a given SQL statement based on statistics, access structures, location, and format. At the center of a query optimizer is a cost model that consumes the above information and helps the optimizer make decisions related to query transformations, join order, join methods, access paths, and data movement. The final execution plan produced by the query optimizer depends on the quality of information used by the cost model, as well as on the sophistication of the cost model itself. In addition to statistics about the data, the cost model also relies on statistics generated internally for intermediate results, e.g. the size of the output of a join operation. This paper presents the problems caused by incorrect statistics of intermediate results, surveys the existing solutions, and presents our solution introduced in Oracle 12c. The solution includes validating the generated statistics against table data and automatically creating auxiliary statistics structures. We limit the overhead of this additional work by confining it to the cases where it matters most, caching the computed statistics, and using table samples. The statistics management is automated. We demonstrate the benefits of our approach through experiments on two SQL workloads: a benchmark that uses data from the Internet Movie Database (IMDB) and a real customer workload.
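To picture the "validation using table data and table samples" idea, the following Oracle-style SQL is a minimal illustrative sketch rather than the paper's actual internal mechanism; the table and column names (customers, country, currency) and the 1% sampling rate are assumptions, and the recursive SQL Oracle issues for adaptive statistics may differ. It shows the general shape of a probe that checks a cardinality estimate for a correlated predicate by counting rows in a small block sample and scaling by the sampling fraction.

-- Hypothetical validation probe (names and sampling rate are illustrative):
-- count rows of a 1% block sample that satisfy the correlated predicate,
-- then scale the count up by the sampling fraction (here x100) to estimate
-- the true cardinality; the result can be cached and reused later.
SELECT /*+ NO_PARALLEL DYNAMIC_SAMPLING(0) */ COUNT(*)
FROM   customers SAMPLE BLOCK (1)
WHERE  country  = 'DE'
AND    currency = 'EUR';

The same pattern extends to intermediate results such as join outputs: a sampled probe of the join inputs gives a measured cardinality that can correct the cost model's estimate before the final plan is chosen.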