Performance, Portability and Productivity for Room Acoustics Codes

semanticscholar(2017)

Abstract
The parallel programming landscape is becoming so vast and complex that developing codes which are performant, portable and easily programmable tends to require a compromise somewhere. Current solutions have focused on abstracting parallelisation instructions away from algorithms, using libraries to hide the details of hardware-specific code and low-level syntax. However, many of these solutions have been tested only on simple benchmarks, are too low-level for productive use, or focus more on hardware-specific optimisations than on specific codes. This project proposes a solution developed from the bottom up: it begins with the algorithm of interest, a room acoustics simulation for modelling sound in an enclosed 3D space, and develops a performance-portable and productive framework from it. The framework has been developed with the flexibility to swap in optimisations and data abstractions. Its performance, portability and productivity improve on or are comparable to native and new implementations of the original benchmark it is based on. Tests were run across six different platforms (CPU, two NVIDIA GPUs, two AMD GPUs and Xeon Phi), comparing four implementations (CUDA, OpenCL, the targetDP library and the new abstracted framework). Results showed that, in the worst case, the abstract framework performs within 23% of the best version on each platform. It also runs 40% faster (45% when optimised) on a previously inaccessible platform than the original benchmark does on the platform it was optimised for. Additionally, comparisons of more advanced versions of the benchmarks with the original versions showed a similar performance profile as the room acoustics algorithms become more complex. This research has shown that it is possible to develop more productive and performance-portable codes for room acoustics simulations, which simplifies the problem of writing programmable and performant code across different platforms.
This work can be extended to other physical simulations that use similar algorithms. Future research could also extend it into a domain-specific language (DSL) for grid-based simulations such as room acoustics. Furthermore, this DSL could be brought into existing frameworks to compile HPC simulations to architecture-specific optimised code.