ExaStencils : Advanced Stencil-Code Engineering — First Project Report —

semanticscholar(2014)

Abstract
Project ExaStencils pursues a radically new approach to stencil-code engineering. Present-day stencil codes are implemented in general-purpose programming languages, such as Fortran, C, or Java, or derivatives thereof, combined with harnesses for parallelism, such as OpenMP, OpenCL, or MPI. ExaStencils favors a much more domain-specific approach with languages at several layers of abstraction, the most abstract being the mathematical formulation and the most concrete the optimized target code. At every layer, the corresponding language expresses not only computational directives but also domain knowledge of the problem and platform to be leveraged for optimization. This approach will enable highly automated code generation at all layers and has been demonstrated successfully before in the U.S. projects FFTW and SPIRAL for certain linear transforms.

1 The Challenges of Exascale Computing

The performance of supercomputers is on the way from petascale to exascale. Software technology for high-performance computing has been struggling to keep up with the advances in computing power, from terascale in 1996 to petascale in 2009 and on to exascale, which is now only a factor of 30 away and predicted for the end of the present decade. So far, traditional host languages, such as Fortran and C, equipped with harnesses for parallelism, such as MPI and OpenMP, have taken most of the burden, and they are being developed further with some new abstractions, notably the partitioned global address space (PGAS) memory model [1] in the languages Coarray Fortran [30], Chapel [9], Fortress [38], Unified Parallel C [8], and X10 [10]. Yet the sequential host languages remain general-purpose: Fortran or C or, if object orientation is desired, C++ or Java. The step from petascale to exascale performance challenges present-day software technology much more than the advances from gigascale to terascale and from terascale to petascale did.
The reason is that the explicit treatment of the massive parallelism inside one node of a high-performance cluster can no longer be avoided. That is, the cluster nodes must be manycores with high numbers of cores. The reorientation of the computer market from single cores to multicores and manycores has been observed with concern [29]. In the high-performance market, the situation is somewhat alleviated by the fact that the additional cycles that large numbers of cores provide are actually being yearned for. But the question of how to exploit them with efficient and robust software remains. While the potential for massive parallelism on and off the chip is the single most serious challenge to exascale software technology, other challenges take on a high priority and are frequently mentioned, such as power conservation, fault tolerance, and heterogeneity of the execution platform [2]. At best, one would strive for performance portability, i.e., the ability to switch the software with ease from one platform, when it is being decommissioned, to the next, while maintaining highest performance.

2 ExaStencils Application Domain: Stencil Codes

Stencil codes have extremely high significance and value for a good-sized community of scientific-computing experts in academia and industry. They see widespread use in solving the systems arising from a discretization of partial differential equations (PDEs) and systems composed of such equations. For the implementation of scalable stencil codes, the foremost requirement is the use of efficient solution algorithms, i.e., iterative solvers that rely on the application of a stencil and that provide good convergence properties. Major application areas are the natural sciences and engineering.
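To make the notion of an iterative solver that relies on the application of a stencil concrete, the following minimal sketch applies Jacobi sweeps of the classic 5-point stencil to a 2D grid, as one would for a discretized Laplace equation. It is an illustration only and is not taken from the ExaStencils project; the function names `make_grid`, `jacobi_sweep`, and `solve`, the boundary setup, and the choice of the Jacobi method are all assumptions made for this example.

```python
def make_grid(n, top=1.0):
    """Hypothetical helper: n x n grid of zeros with the top boundary row set to `top`."""
    u = [[0.0] * n for _ in range(n)]
    u[0] = [top] * n
    return u


def jacobi_sweep(u):
    """Apply one Jacobi sweep of the 5-point stencil to all interior points.

    Boundary values are treated as fixed and copied unchanged.
    Returns a new grid; the input grid is not modified.
    """
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]
    # Nested loops over the interior: each point depends only on its
    # four direct neighbours from the previous iterate.
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return new


def solve(u, sweeps):
    """Repeat the stencil application a fixed number of times."""
    for _ in range(sweeps):
        u = jacobi_sweep(u)
    return u


if __name__ == "__main__":
    grid = solve(make_grid(6), sweeps=50)
    for row in grid:
        print(" ".join(f"{v:.3f}" for v in row))
```

Each interior update in this sketch reads four neighbouring values and performs only three additions and one multiplication, and every update within a sweep is independent of the others, which is why such loops parallelize naturally.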
Stencil codes are algorithms with a pleasantly high regularity: the data structures are higher-dimensional grids, and the computations follow a static, locally contained dependence pattern and are typically arranged in nested loops with affine bounds. This invites massive parallelism and raises the hope for easily achieved high performance. However, serious challenges remain:

– Because of the large numbers and varieties of stencil-code implementations, deriving each of them individually, even if by code modification from one another, is not practical. Not even the use of program libraries is practical; instead, a domain-specific metaprogramming approach is needed.
– Efficiency, i.e., a high ratio of speedup to the degree of parallelism, is impaired by the low computational intensity of stencil codes, i.e., their low ratio of computation steps to data transfers.
– An unsuitable use of the execution platform may act as a performance brake.

3 ExaStencils Approach: Domain-Specific Optimization

With project ExaStencils, we propose a radical departure from the traditional way of developing stencil codes. To this end, we make two major decisions.