Performance bottlenecks detection through microarchitectural sensitivity
CoRR (2024)
Abstract
Modern Out-of-Order (OoO) CPUs are complex systems with many components
interleaved in non-trivial ways. Pinpointing performance bottlenecks and
understanding the underlying causes of program performance issues are critical
tasks to make the most of hardware resources.
We provide an in-depth overview of performance bottlenecks in recent OoO
microarchitectures and describe the difficulties of detecting them. Techniques
that measure resource utilization can offer a good understanding of a
program's execution, but, due to the constraints inherent to Performance
Monitoring Units (PMU) of CPUs, do not provide the relevant metrics for each
use case.
Another approach is to rely on a performance model to simulate the CPU
behavior. Such a model makes it possible to implement any new
microarchitecture-related metric. Within this framework, we advocate for
implementing modeled resources as parameters that can be varied at will to
reveal performance bottlenecks. This allows a generalization of bottleneck
analysis that we call sensitivity analysis.
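To make the idea of varying modeled resources concrete, here is a minimal sketch in Python. It is not the model used by Gus: it assumes a deliberately simplified throughput model in which the cycle count is set by the most saturated resource. The resource names, demand figures, and the functions `estimate_cycles` and `sensitivity` are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def estimate_cycles(demand: Dict[str, float], capacity: Dict[str, float]) -> float:
    """Toy model (assumption): cycles are bounded by the most saturated resource.

    demand[r]   = total work the program places on resource r
    capacity[r] = work that resource r can absorb per cycle
    """
    return max(demand[r] / capacity[r] for r in demand)


def sensitivity(
    demand: Dict[str, float],
    capacity: Dict[str, float],
    factor: float = 2.0,
) -> Dict[str, float]:
    """Scale each resource's capacity in turn and report the resulting speedup."""
    baseline = estimate_cycles(demand, capacity)
    speedups = {}
    for resource in capacity:
        varied = dict(capacity)  # copy so only one parameter changes at a time
        varied[resource] = capacity[resource] * factor
        speedups[resource] = baseline / estimate_cycles(demand, varied)
    return speedups


# Hypothetical kernel profile and machine description.
demand = {"alu_ports": 400.0, "load_ports": 900.0, "frontend": 600.0}
capacity = {"alu_ports": 4.0, "load_ports": 2.0, "frontend": 4.0}

result = sensitivity(demand, capacity)
bottleneck = max(result, key=result.get)
print(result, bottleneck)
```

In this sketch, a resource whose capacity can be increased without changing the estimated cycle count is not limiting performance, while a resource whose variation yields a speedup is a bottleneck candidate. A realistic CPU model would also have to account for dependencies and interactions between resources, which this toy model ignores.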
We present Gus, a novel performance analysis tool that combines the
advantages of sensitivity analysis and dynamic binary instrumentation within a
resource-centric CPU model. We evaluate the impact of sensitivity on bottleneck
analysis over a set of high-performance computing kernels.