Modeling Large-Scale Slim Fly Networks Using Parallel Discrete-Event Simulation.

ACM Trans. Model. Comput. Simul. (2018)

Abstract
As supercomputers approach exascale performance, the increased number of processors translates to an increased demand on the underlying network interconnect. The slim fly network topology, a new low-diameter, low-latency, and low-cost interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this article, we present a high-fidelity slim fly packet-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES) frameworks. We validate the model with published work before scaling the network size up to an unprecedented 1 million compute nodes and confirming that the slim fly observes peak network throughput at extreme scale. In addition to synthetic workloads, we evaluate large-scale slim fly models with real communication workloads from applications in the Design Forward program with over 110,000 MPI processes. We show strong scaling of the slim fly model on an Intel cluster achieving a peak network packet transfer rate of 2.3 million packets per second and processing over 7 billion discrete events using 128 MPI tasks. Enabled by the strong performance capabilities of the model, we perform a detailed application trace and routing protocol performance study. Through analysis of metrics such as packet latency, hop count, and congestion, we find that the slim fly network is able to leverage simple minimal routing and achieve the same performance as more complex adaptive routing for tested DOE benchmark applications.
Keywords
Slim fly, interconnection networks, network topologies, parallel discrete event simulation