Replay Debugging For Distributed Applications
ATEC '06: Proceedings of the annual conference on USENIX '06 Annual Technical Conference (2006)
Abstract
We have developed a new replay debugging tool, liblog, for distributed C/C++ applications. It logs the execution of deployed application processes and replays them deterministically, faithfully reproducing race conditions and non-deterministic failures, enabling careful offline analysis. To our knowledge, liblog is the first replay tool to address the requirements of large distributed systems: lightweight support for long-running programs, consistent replay of arbitrary subsets of application nodes, and operation in a mixed environment of logging and non-logging processes. In addition, it requires no special hardware or kernel patches, supports unmodified application executables, and integrates GDB into the replay mechanism for simultaneous source-level debugging of multiple processes. This paper presents liblog's design, an evaluation of its runtime overhead, and a discussion of our experience with the tool to date.
Keywords
consistent replay, new replay, replay mechanism, replay tool, application node, application process, unmodified application executables, arbitrary subsets, careful offline analysis, kernel patch