Characterizing the Influence of System Noise on Large-Scale Applications by Simulation.

SC(2010)

Abstract
This paper presents an in-depth analysis of the impact of system noise on large-scale parallel application performance in realistic settings. Our analytical model shows that not only collective operations but also point-to-point communications influence the application's sensitivity to noise. We present a simulation toolchain that injects noise delays from traces gathered on common large-scale architectures into a LogGPS simulation and allows new insights into the scaling of applications in noisy environments. We investigate collective operations with up to 1 million processes and three applications (Sweep3D, AMG, and POP) with up to 32,000 processes. We show that the scale at which noise becomes a bottleneck is system-specific and depends on the structure of the noise. Simulations with different network speeds show that a 10x faster network does not improve application scalability. We quantify noise and conclude that our tools can be utilized to tune the noise signatures of a specific system.
Keywords
multiprocessing systems, parallel architectures, AMG, LogGPS simulation, POP, Sweep3D, in-depth analysis, large-scale architectures, large-scale parallel application performance, noise delays, noise signatures, point-to-point communications, system noise
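
Illustrative sketch
The abstract describes injecting noise delays into a simulation to study how synchronization amplifies noise at scale. The toy model below is only a rough illustration of that idea and is not the paper's toolchain: it does not implement LogGPS, and it replaces measured noise traces with a synthetic on/off delay. The function names `simulate_bsp` and `slowdown`, and all parameters, are hypothetical.

```python
import random


def simulate_bsp(num_procs, num_steps, work_us, noise_prob, noise_us, seed=0):
    """Total runtime (microseconds) of a bulk-synchronous loop with injected noise.

    Assumption: in every step each process computes for work_us and, with
    probability noise_prob, suffers one extra delay of noise_us. A barrier
    closes each step, so the step lasts as long as the slowest process.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    total = 0.0
    for _ in range(num_steps):
        slowest = 0.0
        for _ in range(num_procs):
            delay = noise_us if rng.random() < noise_prob else 0.0
            slowest = max(slowest, work_us + delay)
        total += slowest
    return total


def slowdown(num_procs, **kw):
    """Ratio of noisy runtime to the noise-free runtime for the same loop."""
    noisy = simulate_bsp(num_procs, **kw)
    ideal = kw["num_steps"] * kw["work_us"]
    return noisy / ideal


if __name__ == "__main__":
    # Synthetic parameters chosen for illustration only.
    params = dict(num_steps=200, work_us=100.0, noise_prob=0.001, noise_us=50.0)
    for p in (16, 256, 4096):
        print(p, round(slowdown(p, **params), 3))
```

In this simplified model a step is delayed as soon as any one process is hit, so the chance of a delayed step grows with the process count. It says nothing about point-to-point communication or network parameters, which the paper's analysis also covers.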