MATAR: A performance portability and productivity implementation of data-oriented design with Kokkos

Journal of Parallel and Distributed Computing(2021)

引用 6|浏览7
暂无评分
摘要
There is a need for simple, fast, and memory-efficient multidimensional data structures for dense and sparse storage that arise with numerical methods and in software applications. The data structures must perform equally well across multiple computer architectures, including CPUs and GPUs. For this purpose, we developed MATAR, a C++ software library that allows for simple creation and use of intricate data structures that is also portable across disparate architectures using Kokkos. The performance aspect is achieved by forcing contiguous memory layout (or as close to contiguous as possible) for multidimensional and multi-size dense or sparse MATrix and ARray (hence, MATAR) types. Our results show that MATAR has the capability to improve memory utilization, performance, and programmer productivity in scientific computing. This is achieved by fitting more work into the available memory, minimizing memory loads required, and by loading memory in the most efficient order. This document describes the purpose of the work, the implementation of each of the data types, and the resulting performance both in some simple baseline test cases and in an application code.
更多
查看译文
关键词
Performance,Portability,Productivity,Memory efficiency,GPUs,Dense and sparse storage
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要