Unshackling Database Benchmarking from Synthetic Workloads.

ICDE(2023)

Abstract
Introducing new (learned) features into a DBMS requires considerable experimentation and benchmarking to avoid regressions on production (customer) workloads. Using standard benchmarks such as TPC-H and TPC-DS is common practice, but these benchmarks do not capture the complexity of real production workloads. To address this problem, in this demo we propose a technique that generates a synthetic dataset from query logs and metadata, without touching the original data. The keystone of our approach is to cast data generation as a SAT problem whose constraints, such as runtime cardinalities, are extracted from query logs and metadata. We show that our approach can generate representative benchmarks that mirror the performance of the original data without compromising privacy. The demo will guide attendees through the steps involved in the data generation and testing process.
Keywords
synthetic data generation, benchmarking, SAT
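
The abstract's central idea, treating data generation as a constraint problem driven by observed cardinalities, can be illustrated with a toy sketch. This is not the authors' encoding: it replaces a SAT solver with plain backtracking search, handles a single integer column, and every name in it (`generate_column`, the `price` predicates, the cardinalities) is a hypothetical stand-in for what would be extracted from real query logs and metadata.

```python
def generate_column(n_rows, domain, constraints):
    """Search for n_rows values from `domain` such that every
    (predicate, cardinality) pair in `constraints` holds exactly.

    Toy backtracking search standing in for a SAT solver (an assumption,
    not the paper's method). Returns a list of values, or None if the
    constraints cannot all be satisfied.
    """
    preds = [p for p, _ in constraints]
    targets = [c for _, c in constraints]

    def search(rows, counts, start):
        remaining = n_rows - len(rows)
        # Prune: a count already exceeds its target, or can no longer reach it.
        for cnt, tgt in zip(counts, targets):
            if cnt > tgt or cnt + remaining < tgt:
                return None
        if remaining == 0:
            return list(rows)
        # Values are chosen in non-decreasing order, since row order
        # does not affect cardinalities (simple symmetry breaking).
        for i in range(start, len(domain)):
            value = domain[i]
            new_counts = [cnt + (1 if p(value) else 0)
                          for cnt, p in zip(counts, preds)]
            rows.append(value)
            result = search(rows, new_counts, i)
            if result is not None:
                return result
            rows.pop()
        return None

    return search([], [0] * len(constraints), 0)


# Hypothetical cardinalities, as if read from a query log over a 5-row table:
#   WHERE price < 10               -> 2 rows
#   WHERE price >= 50              -> 1 row
#   WHERE price BETWEEN 5 AND 60   -> 4 rows
constraints = [
    (lambda v: v < 10, 2),
    (lambda v: v >= 50, 1),
    (lambda v: 5 <= v <= 60, 4),
]
domain = list(range(0, 100, 5))  # assumed value range taken from metadata
column = generate_column(5, domain, constraints)
print(column)
```

Replaying the three predicates against the generated column reproduces the logged cardinalities, even though no original row was ever read. A realistic generator would have to handle many columns, joins, and far larger tables, which is where a dedicated solver becomes necessary.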