SIMD code generation for stencils on brick decompositions.
PPoPP (2018)
Abstract
We present a stencil library and associated compiler code generation framework designed to maximize performance on higher-order stencil computations through the use of two main technologies: a fine-grained brick data layout designed to exploit the inherent multidimensional spatial locality endemic to stencil computations, and a vector scatter associative reordering transformation that reduces vector loads and alignment operations and exposes opportunities for the backend compiler to reduce computation. For a range of stencil computations, we compare the generated code expressed in the brick library to the standard tiled code. We attain up to a 7.2X speedup on the most complex stencils when running on an Intel Knights Landing (Xeon Phi) processor.
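To make the brick layout idea concrete, the following is a minimal sketch, not the library's actual implementation. It assumes a cubic n×n×n grid split into b×b×b bricks, with the bricks themselves ordered row-major; the function names `brick_offset` and `row_major_offset` are hypothetical and chosen only for illustration. It shows how every element of one brick falls in a single contiguous run of memory, whereas the same elements are spread far apart in a conventional row-major array.

```python
def row_major_offset(i, j, k, n):
    """Flat offset of (i, j, k) in a conventional row-major n*n*n array."""
    return (i * n + j) * n + k


def brick_offset(i, j, k, n, b):
    """Flat offset of (i, j, k) when the grid is stored as contiguous b*b*b bricks.

    Assumption for this sketch: bricks are ordered row-major by brick
    coordinate, and elements inside a brick are also row-major.
    """
    if n % b != 0:
        raise ValueError("grid size must be a multiple of the brick size")
    nb = n // b                                   # bricks per dimension
    bi, bj, bk = i // b, j // b, k // b           # which brick holds the element
    li, lj, lk = i % b, j % b, k % b              # position inside that brick
    brick_id = (bi * nb + bj) * nb + bk
    local = (li * b + lj) * b + lk
    return brick_id * b ** 3 + local


def brick_span(n, b, layout):
    """Distance between the lowest and highest offset touched by brick (0, 0, 0)."""
    offsets = [layout(i, j, k) for i in range(b) for j in range(b) for k in range(b)]
    return max(offsets) - min(offsets) + 1


n, b = 8, 4
bricked = brick_span(n, b, lambda i, j, k: brick_offset(i, j, k, n, b))
tiled = brick_span(n, b, lambda i, j, k: row_major_offset(i, j, k, n))
print(bricked, tiled)
```

In this sketch a brick occupies exactly b³ consecutive elements, so a stencil sweep over one brick touches a compact memory region in all three dimensions, while the row-major span for the same block of elements is several times larger.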
Keywords
SIMDization, compiler optimization, stencil