A Theoretical Model For Global Optimization Of Parallel Algorithms

MATHEMATICS(2021)

引用 7|浏览9
暂无评分
摘要
With the quickly evolving hardware landscape of high-performance computing (HPC) and its increasing specialization, the implementation of efficient software applications becomes more challenging. This is especially prevalent for domain scientists and may hinder the advances in large-scale simulation software. One idea to overcome these challenges is through software abstraction. We present a parallel algorithm model that allows for global optimization of their synchronization and dataflow and optimal mapping to complex and heterogeneous architectures. The presented model strictly separates the structure of an algorithm from its executed functions. It utilizes a hierarchical decomposition of parallel design patterns as well-established building blocks for algorithmic structures and captures them in an abstract pattern tree (APT). A data-centric flow graph is constructed based on the APT, which acts as an intermediate representation for rich and automated structural transformations. We demonstrate the applicability of this model to three representative algorithms and show runtime speedups between 1.83 and 2.45 on a typical heterogeneous CPU/GPU architecture.
更多
查看译文
关键词
parallel programming, parallel patterns, program optimization, optimizing framework, dependence analysis
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要