
High Performance MPI on IBM 12x InfiniBand Architecture

IEEE International Parallel and Distributed Processing Symposium (2007)

Abstract
InfiniBand is becoming increasingly popular in the area of cluster computing due to its open standard and high performance. I/O interfaces like PCI-Express and GX+ are being introduced as next-generation technologies to drive InfiniBand with very high throughput. HCAs with 8x throughput on PCI-Express have become available. Recently, support for HCAs with 12x throughput on GX+ has been announced. In this paper, we design a message passing interface (MPI) on IBM 12x dual-port HCAs, which consist of multiple send/recv engines per port. We propose and study the impact of various communication scheduling policies (binding, striping and round robin). Based on this study, we present a new policy, EPC (enhanced point-to-point and collective), which incorporates different kinds of communication patterns for data transfer: point-to-point (blocking, non-blocking) and collective communication. We implement our design and evaluate it with micro-benchmarks, collective communication and NAS parallel benchmarks. Using EPC on a 12x InfiniBand cluster with one HCA and one port, we can improve performance by 41% on the ping-pong latency test and by 63-65% on the unidirectional and bidirectional bandwidth tests, compared with the default single-rail MPI implementation. Our evaluation on the NAS parallel benchmarks shows an improvement of 7-13% in execution time for integer sort and Fourier transform.
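
The scheduling policies named in the abstract can be pictured with a small sketch. The code below is not taken from the paper's implementation; port_t and post_send_on_port are hypothetical stand-ins for the per-port send/recv engines of a 12x dual-port HCA, used only to contrast binding, round robin, and striping as assumed here.

/*
 * Minimal sketch, assuming NUM_PATHS independent send paths (e.g. the
 * ports/send engines of a dual-port HCA). All names are illustrative.
 */
#include <stdio.h>
#include <stddef.h>

#define NUM_PATHS 2

typedef struct { int id; } port_t;   /* hypothetical handle for one send path */

/* Stub for a low-level post of one chunk on one path. */
static void post_send_on_port(port_t *p, const char *buf, size_t len) {
    printf("path %d: send %zu bytes starting at %p\n", p->id, len, (const void *)buf);
}

/* Binding: every message of this connection uses one fixed path. */
static void send_binding(port_t paths[], const char *buf, size_t len) {
    post_send_on_port(&paths[0], buf, len);
}

/* Round robin: successive messages alternate across the available paths. */
static void send_round_robin(port_t paths[], const char *buf, size_t len) {
    static int next = 0;
    post_send_on_port(&paths[next], buf, len);
    next = (next + 1) % NUM_PATHS;
}

/* Striping: one large message is split into chunks, one chunk per path,
 * so multiple send engines drive parts of the same transfer in parallel. */
static void send_striping(port_t paths[], const char *buf, size_t len) {
    size_t chunk = len / NUM_PATHS;
    for (int i = 0; i < NUM_PATHS; i++) {
        size_t off  = (size_t)i * chunk;
        size_t part = (i == NUM_PATHS - 1) ? len - off : chunk;
        post_send_on_port(&paths[i], buf + off, part);
    }
}

int main(void) {
    port_t paths[NUM_PATHS] = { {0}, {1} };
    char msg[1024] = {0};
    send_binding(paths, msg, sizeof msg);
    send_round_robin(paths, msg, sizeof msg);
    send_striping(paths, msg, sizeof msg);
    return 0;
}

An EPC-style policy, as described in the abstract, would additionally choose among such strategies based on the communication pattern (blocking vs. non-blocking point-to-point, or collective); that selection logic is not sketched here.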
Keywords
Fourier transforms, application program interfaces, computer architecture, message passing, peripheral interfaces, Fourier transform, HCA, IBM 12x InfiniBand architecture, PCI-express, application program interface, cluster computing, communication scheduling policy, data transfer, high performance MPI, peripheral interface