On Higher-performance Sparse Tensor Transposition

IPDPS Workshops (2023)

Abstract
Sparse tensor algebra has been an important tool in scientific research. However, the performance of sparse tensor transposition has received limited study, even though it can be a critical performance bottleneck in sparse tensor algebra computations. In this work, we characterize the performance of sparse tensor transposition within a chain of distinct sparse tensor algebra operations, and we study transposition with and without sorting, as well as with various sorting algorithms. We find that sparse tensor transposition with count sort performs best among the options considered. However, even with the best-performing count sort variant, sparse tensor transposition remains the performance bottleneck in tensor computations with mixed operation types, such as Sparse-Tensor-Times-Vector (SpTTV). To resolve this bottleneck, we further investigate possible algorithm designs and propose an efficient new sparse tensor transposition algorithm that integrates bucket sort and count sort. The evaluation results show that our algorithm is faster than the state-of-the-art (SOTA) approach, with a 1.2x speedup on average and up to a 2.3x speedup with multithreading. There is also potential to further improve the performance of sparse tensor transposition by selectively enabling either the SOTA approach or ours on particular coordinate dimensions, depending on the coordinate distribution.
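To illustrate the kind of operation the abstract discusses, the following is a minimal sketch (not the paper's implementation) of transposing a COO-format sparse tensor: each nonzero's coordinate tuple is permuted, and lexicographic order is then restored with a stable count sort applied per dimension, least-significant dimension first, as in LSD radix sort. The function name and signature are illustrative assumptions.

```python
# Hedged sketch: COO sparse tensor transposition via per-dimension count sort.
# Not taken from the paper; an assumed minimal formulation for illustration.

def transpose_coo(coords, vals, perm, dims):
    """coords: list of coordinate tuples, one per nonzero; vals: their values;
    perm: the dimension permutation; dims: the tensor's mode sizes."""
    # Apply the permutation to every coordinate tuple (and to the mode sizes).
    new_dims = [dims[p] for p in perm]
    entries = [([c[p] for p in perm], v) for c, v in zip(coords, vals)]

    # Stable count sort on each dimension, from last to first, so the
    # final order is lexicographic in the transposed coordinates.
    for d in range(len(new_dims) - 1, -1, -1):
        count = [0] * new_dims[d]
        for c, _ in entries:
            count[c[d]] += 1
        # Exclusive prefix sums give each key its starting output offset.
        total = 0
        for k in range(new_dims[d]):
            count[k], total = total, total + count[k]
        out = [None] * len(entries)
        for c, v in entries:  # stable scatter pass
            out[count[c[d]]] = (c, v)
            count[c[d]] += 1
        entries = out

    return [tuple(c) for c, _ in entries], [v for _, v in entries]
```

Each count sort pass is linear in the number of nonzeros plus the mode size, which is why count sort tends to outperform comparison sorts here; the bucket-sort integration proposed in the paper would partition nonzeros before such passes.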
Keywords
sparse tensor algebra, sparse tensor transposition, parallel sorting algorithm