An Affine Scheduling Framework for Integrating Data Layout and Loop Transformations

Languages and Compilers for Parallel Computing, LCPC 2020 (2022)

Abstract

Code transformations in optimizing compilers can often be classified as loop transformations that change the execution order of statement instances and data layout transformations that change the memory layouts of variables. There is a mutually dependent relationship between the two, i.e., the best statement execution order can depend on the underlying data layout and vice versa. Existing approaches have typically addressed this inter-dependency by picking a specific phase order, and can thereby miss opportunities to co-optimize loop transformations and data layout transformations. In this paper, we propose a cost-based integration of loop and data layout transformations, aiming to cover a broader optimization space than phase-ordered strategies and thereby to find better solutions. Our approach builds on the polyhedral model, and shows how both loop and data layout transformations can be represented as affine scheduling in a unified manner. To efficiently explore the broader optimization space, we build analytical memory and computational cost models that are parameterized with a range of machine features including hardware parallelism, cache and TLB locality, and vectorization. Experimental results obtained on 12-core Intel Xeon and 24-core IBM POWER8 platforms demonstrate that, for a set of 22 Polybench benchmarks, our proposed cost-based integration approach can respectively deliver 1.3x and 1.6x geometric mean improvements over a state-of-the-art polyhedral optimizer, PLuTo, and a 1.2x geometric mean improvement on both platforms over a phase-ordered approach in which loop transformations are followed by the best data layout transformations.
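The mutual dependence described in the abstract can be illustrated with a toy sketch. It is not the paper's formulation or cost model: the names `apply_affine` and `mean_stride`, the single access `A[j][i]`, and the stride-based cost are assumptions chosen for illustration. Both the loop transformation (a schedule on iterations) and the data layout transformation (a map on array indices) are written as affine maps of the same form, and the cost of one depends on the choice of the other.

```python
from itertools import product

IDENTITY = ((1, 0), (0, 1))   # leave the order or layout unchanged
SWAP = ((0, 1), (1, 0))       # loop interchange, or array transposition
ZERO = (0, 0)


def apply_affine(matrix, offset, point):
    """Apply the affine map x -> matrix * x + offset to an integer point."""
    return tuple(
        sum(m * p for m, p in zip(row, point)) + c
        for row, c in zip(matrix, offset)
    )


def mean_stride(n, schedule, layout):
    """Average address distance between consecutive reads of A[j][i].

    schedule: affine map from iteration (i, j) to a time stamp
              (the loop transformation).
    layout:   affine map from array index to storage index
              (the data layout transformation).
    Storage is assumed row-major over the storage index (hypothetical model).
    """
    iterations = list(product(range(n), repeat=2))
    # Run iterations in lexicographic order of their time stamps.
    iterations.sort(key=lambda it: apply_affine(schedule, ZERO, it))
    addresses = []
    for i, j in iterations:
        row, col = apply_affine(layout, ZERO, (j, i))  # statement reads A[j][i]
        addresses.append(row * n + col)
    gaps = [abs(b - a) for a, b in zip(addresses, addresses[1:])]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    for name, sched, lay in [
        ("original", IDENTITY, IDENTITY),
        ("interchange only", SWAP, IDENTITY),
        ("transpose only", IDENTITY, SWAP),
        ("interchange + transpose", SWAP, SWAP),
    ]:
        print(f"{name}: mean stride = {mean_stride(4, sched, lay)}")
```

In this toy model, either transformation alone yields unit-stride accesses, while applying both restores the original strided pattern. A phase-ordered strategy that fixes one choice before considering the other cannot see this interaction, which is the motivation the abstract gives for integrating the two.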
