Unshackling Database Benchmarking from Synthetic Workloads.

Parimarjan Negi, Laurent Bindschaedler, Mohammad Alizadeh, Tim Kraska, Jyoti Leeka, Anja Gruenheid, Matteo Interlandi

ICDE (2023)

Abstract

Introducing new (learned) features into a DBMS requires considerable experimentation and benchmarking to avoid regressions on production (customer) workloads. Using standard benchmarks such as TPC-H and TPC-DS is common practice, but, unfortunately, these do not represent the complexity of real production workloads. To solve this problem, in this demo, we propose a technique that generates a synthetic dataset from query logs and metadata, without touching the original data. The keystone of our approach is to cast data generation as a SAT problem whose constraints, such as runtime cardinalities, are extracted from query logs and metadata. We show that our approach can generate representative benchmarks mirroring the performance of the original data without trading off privacy. The demo will guide attendees through the various steps involved in the data generation and testing process.
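
To make the idea of cardinality constraints concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a single integer column, a hypothetical list of (predicate, observed row count) pairs standing in for what a query log might provide, and it swaps the SAT solver for a plain exhaustive search over a tiny domain. All names (`CONSTRAINTS`, `generate_column`) are made up for this example.

```python
from itertools import combinations_with_replacement

# Hypothetical constraints, as might be recovered from a query log:
# each entry is (predicate on the column value, number of rows that matched).
CONSTRAINTS = [
    (lambda v: v < 3, 2),   # e.g. "WHERE col < 3" returned 2 rows
    (lambda v: v >= 2, 3),  # e.g. "WHERE col >= 2" returned 3 rows
    (lambda v: v == 2, 1),  # e.g. "WHERE col = 2" returned 1 row
]


def generate_column(num_rows, domain, constraints):
    """Return a list of values meeting every cardinality constraint, or None.

    Exhaustive search stands in for a SAT solver here; it only scales to
    toy inputs. Row order is irrelevant, so multisets of values are enough.
    """
    for candidate in combinations_with_replacement(domain, num_rows):
        if all(
            sum(1 for v in candidate if pred(v)) == card
            for pred, card in constraints
        ):
            return list(candidate)
    return None


# Synthesize a 4-row column over the domain {0, ..., 4} without seeing real data.
column = generate_column(num_rows=4, domain=range(5), constraints=CONSTRAINTS)
```

Any column returned this way reproduces the recorded cardinalities for the listed predicates, which is the property the abstract relies on; a real system would need a proper solver and many more constraint types.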

Keywords

synthetic data generation, benchmarking, SAT
