Effective Resource Management For Enhancing Performance Of 2d And 3d Stencils On Gpus

9TH WORKSHOP ON GENERAL PURPOSE PROCESSING USING GPUS (GPGPU 9)(2016)

引用 18|浏览4
暂无评分
摘要
GPUs are an attractive target for data parallel stencil computations prevalent in scientific computing and image processing applications. Many tiling schemes, such as overlapped tiling and split tiling, have been proposed in past to improve the performance of stencil computations. While effective for 2D stencils, these techniques do not achieve the desired improvements for 3D stencils due to the hardware constraints of GPU.A major challenge in optimizing stencil computations is to effectively utilize all resources available on the GPU. In this paper we develop a tiling strategy that makes better use of resources like shared memory and register file available on the hardware. We present a systematic methodology to reason about which strategy should be employed for a given stencil and also discuss implementation choices that have a significant effect on the achieved performance. Applying these techniques to various 2D and 3D stencils gives a performance improvement of 200-400% over existing tools that target such computations.
更多
查看译文
关键词
Resource management, Stencil computations, Tiling, GPGPU
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要