Unfair Scheduling Patterns in NUMA Architectures
2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2019
Abstract
Lock-free algorithms are typically designed and analyzed with adversarial scheduling in mind. However, on real hardware, lock-free algorithms perform much better than the adversarial assumption predicts, suggesting that adversarial scheduling is unrealistic. In pursuit of more realistic analyses, recent work has studied lock-free algorithms under gentler scheduling models. This raises the question: what concurrent scheduling models are realistic? The issue is complicated by the intricacies of modern hardware, such as cache coherence protocols and non-uniform memory access (NUMA). In this paper, we thoroughly investigate concurrent scheduling on real hardware. To do so, we introduce Severus, a new benchmarking tool that allows the user to specify a lock-free workload in terms of the locations accessed and the cores participating. Severus measures the performance of the workload and logs enough information to reconstruct an execution trace. We demonstrate Severus's capabilities by uncovering the scheduling details of two NUMA machines with different microarchitectures: one AMD Opteron 6278 machine, and one Intel Xeon CPU E7-8867 v4 machine. We show that the two architectures yield very different schedules, but both exhibit unfair executions that skew toward remote nodes in contended workloads.
Keywords
NUMA, contention, scheduler