Evaluating Attainable Memory Bandwidth Of Parallel Programming Models Via Babelstream

INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING (2018)

Abstract
Many scientific codes consist of memory-bandwidth-bound kernels. One major advantage of many-core devices such as general-purpose graphics processing units (GPGPUs) and the Intel Xeon Phi is their focus on providing increased memory bandwidth over traditional CPU architectures. Peak memory bandwidth is usually unachievable in practice, so benchmarks are required to measure a practical upper bound on expected performance. We augment the standard STREAM kernels with a dot product kernel to investigate the performance of simple reduction operations on large arrays. The choice of programming model should ideally not limit the achievable performance on a device. BabelStream (formerly GPU-STREAM) has been updated to incorporate a wide variety of the latest parallel programming models, all implementing the same parallel scheme. As such, this tool can be used as a kind of Rosetta Stone, providing an array of achievable memory bandwidth results across both platforms and programming models.
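As a rough illustration of the kernels mentioned above, the following is a minimal sketch in plain Python of the four standard STREAM operations (copy, mul, add, triad) plus a dot product, together with the usual way of turning a timing into a bandwidth figure (bytes moved divided by elapsed time). It is not BabelStream's code: the function names, the initial values (0.1, 0.2, 0.0) and the scalar 0.4 are assumptions chosen for illustration, and interpreter overhead means the reported numbers say nothing about real hardware bandwidth.

```python
import time
from array import array

# Number of arrays each kernel reads or writes (assumed counting convention:
# one read or one write of a full array counts as one array's worth of traffic).
ARRAYS_TOUCHED = {"copy": 2, "mul": 2, "add": 3, "triad": 3, "dot": 2}


def copy(a, c):
    for i in range(len(a)):
        c[i] = a[i]


def mul(b, c, scalar):
    for i in range(len(b)):
        b[i] = scalar * c[i]


def add(a, b, c):
    for i in range(len(a)):
        c[i] = a[i] + b[i]


def triad(a, b, c, scalar):
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def dot(a, b):
    # Simple reduction: the kernel added on top of the standard STREAM set.
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def bytes_moved(kernel, n, itemsize=8):
    """Bytes read plus written by one kernel invocation on n elements."""
    return ARRAYS_TOUCHED[kernel] * n * itemsize


def run(n=1000, scalar=0.4):
    """Run each kernel once; return the arrays, dot result, and bytes/second."""
    # Illustrative initial values, not taken from the paper.
    a = array("d", [0.1]) * n
    b = array("d", [0.2]) * n
    c = array("d", [0.0]) * n
    kernels = [
        ("copy", lambda: copy(a, c)),
        ("mul", lambda: mul(b, c, scalar)),
        ("add", lambda: add(a, b, c)),
        ("triad", lambda: triad(a, b, c, scalar)),
        ("dot", lambda: dot(a, b)),
    ]
    bandwidth = {}
    dot_result = None
    for name, fn in kernels:
        start = time.perf_counter()
        result = fn()
        elapsed = max(time.perf_counter() - start, 1e-12)  # avoid divide-by-zero
        bandwidth[name] = bytes_moved(name, n, a.itemsize) / elapsed
        if name == "dot":
            dot_result = result
    return a, b, c, dot_result, bandwidth
```

Calling `run()` returns the final arrays, the dot product, and a dictionary of bytes-per-second figures keyed by kernel name. A real benchmark would use arrays far larger than the last-level cache, repeat each kernel many times, and report the best or average timing.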
Keywords
performance portability, many-core, parallel programming models, memory bandwidth benchmark