Speeding Up the Communications on a Cluster Using MPI by Means of Software Defined Networks
Future Generation Computer Systems: The International Journal of eScience (2024)
Abstract
The Open MPI library is widely used to implement the message-passing programming model in parallel applications running on distributed-memory computer systems, such as large data centers. These applications aim to exploit as much of the available High Performance Computing (HPC) resources as possible. The interconnection network is an essential part of the HPC environment, because the processes of a parallel application constantly communicate and share data. Software Defined Networking (SDN) is a networking approach that separates the control plane from the data forwarding plane, so that forwarding can be configured according to the network status or the specific communication requirements of a parallel application. Given that communication time contributes significantly to the overall execution time of a parallel program, and considering the time Open MPI spends during initialization establishing TCP connections between processes in Ethernet networks, this paper proposes integrating a software defined networking environment into the Open MPI library. The primary objective of our contribution is to provide the network controller with information about Open MPI processes, so that the network can be configured during the initialization procedure of the Open MPI library. This may facilitate the development of SDN-based routing techniques that reduce communication times, and thus execution times, using application information such as the Open MPI endpoints participating in a parallel program execution. To demonstrate the utility of the information provided by Open MPI processes, we have implemented a routing algorithm that computes the optimal paths between processes with a weighted Dijkstra algorithm, where the weights are based on the number of flows traversing the topology links.
The proposed mechanism was evaluated on a two-stage fat-tree topology with two parallel applications: a matrix product and the Model for Prediction Across Scales (MPAS). The evaluation showed significant improvements in execution time, with reductions of up to 2.5 times for a 4096 × 4096 matrix product and 1.3 times for an 8192 × 8192 matrix product, as well as a 1.5 times reduction for MPAS in the worst network occupancy scenario. These results indicate that the improvement in communication time translates into shorter execution times.
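The routing idea described in the abstract (a weighted Dijkstra search whose weights depend on how many flows already traverse each link) can be illustrated with a minimal sketch. This is not the authors' implementation: the link cost of `1 + flows`, the function names `least_loaded_path` and `install_flow`, and the small two-spine topology are all assumptions chosen for illustration, and no real SDN controller API is involved.

```python
import heapq
from collections import defaultdict


def least_loaded_path(links, flows, src, dst):
    """Dijkstra search where crossing a link costs 1 + flows already on it.

    The cost function is an assumption made for this sketch; the abstract
    only states that weights use the number of flows on each link.
    """
    adj = defaultdict(list)
    for a, b in links:  # links are treated as undirected
        adj[a].append(b)
        adj[b].append(a)

    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt in adj[node]:
            cost = d + 1 + flows[frozenset((node, nxt))]
            if cost < dist.get(nxt, float("inf")):
                dist[nxt] = cost
                prev[nxt] = node
                heapq.heappush(heap, (cost, nxt))

    if dst not in dist:
        return None  # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return path


def install_flow(links, flows, src, dst):
    """Pick a path for a new flow and record it on every link it uses."""
    path = least_loaded_path(links, flows, src, dst)
    if path is not None:
        for a, b in zip(path, path[1:]):
            flows[frozenset((a, b))] += 1
    return path


# Hypothetical two-stage fat tree: two spine switches, two leaf switches,
# two hosts per leaf.
links = [
    ("h1", "l1"), ("h2", "l1"), ("h3", "l2"), ("h4", "l2"),
    ("l1", "s1"), ("l1", "s2"), ("l2", "s1"), ("l2", "s2"),
]
flows = defaultdict(int)  # flow count per undirected link

p1 = install_flow(links, flows, "h1", "h3")
p2 = install_flow(links, flows, "h2", "h4")
print(p1)
print(p2)
```

In this sketch the second flow is steered through the spine switch that the first flow did not use, because the links of the first path have become more expensive. A controller that knows the Open MPI endpoints at initialization time could, under these assumptions, run such a computation for every process pair before traffic starts.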
Keywords
High performance computing, Software defined networks, Message-passing interface, Open MPI, Parallel computing