Design and Analysis of Benchmarking Experiments for Distributed Internet Services
WWW(2015)
摘要
The successful development and deployment of large-scale Internet services depends critically on performance. Even small regressions in processing time can translate directly into significant energy and user experience costs. Despite the widespread use of distributed server infrastructure (e.g., in cloud computing and Web services), there is little research on how to benchmark such systems to obtain valid and precise inferences with minimal data collection costs. Correctly A/B testing distributed Internet services can be surprisingly difficult because interdependencies between user requests (e.g., for search results, social media streams, photos) and host servers violate assumptions required by standard statistical tests. We develop statistical models of distributed Internet service performance based on data from Perflab, a production system used at Facebook which vets thousands of changes to the company's codebase each day. We show how these models can be used to understand the tradeoffs between different benchmarking routines, and what factors must be taken into account when performing statistical tests. Using simulations and empirical data from Perflab, we validate our theoretical results, and provide easy-to-implement guidelines for designing and analyzing such benchmarks.
更多查看译文
关键词
a/b testing,benchmarking,bootstrapping,cloud computing,experimental design
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络