GE-SpMM: general-purpose sparse matrix-matrix multiplication on GPUs for graph neural networks

The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2020

Abstract
The acceleration of Graph Neural Networks (GNNs) requires efficient and framework-compatible Sparse-Dense Matrix-Matrix Multiplication (SpMM). From the compatibility perspective, the sophisticated sparse matrix representations in state-of-the-art SpMM designs impose heavy preprocessing overhead on the framework. From the efficiency perspective, optimizations for SpMV (Sparse Matrix-Vector multiplication) do not carry over well to SpMM, leading to redundant and uncoalesced global memory accesses. We propose GE-SpMM, which takes the CSR format consistent with GNN frameworks to enable integration without format transformation overhead. We use Coalesced Row Caching to ensure coalesced access to both sparse and dense data in global memory, and Coarse-grained Warp Merging to reduce redundant data loading among GPU warps. Experiments on a real-world graph dataset demonstrate up to 1.41× speedup over Nvidia cuSPARSE [1] and up to 1.81× over GraphBLAST [2]. We embed GE-SpMM in GNN frameworks and obtain up to 3.67× speedup on popular GNN models such as GCN [3] and GraphSAGE [4].
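
For reference, the abstract assumes the standard CSR layout (row pointers, column indices, nonzero values) and a row-major dense feature matrix. The sketch below is a plain CSR-based SpMM CUDA kernel over that layout, included only to make the data layout concrete; the kernel name and parameters are illustrative, and it deliberately omits GE-SpMM's Coalesced Row Caching and Coarse-grained Warp Merging, which are described in the paper.

// Minimal CSR SpMM sketch: C = A * B, where A (m x k) is sparse in CSR and
// B (k x n), C (m x n) are dense, row-major. One thread computes one element
// of C. This is a baseline illustration, NOT the GE-SpMM kernel.
__global__ void csr_spmm_naive(int m, int n,
                               const int *row_ptr,   // CSR row offsets, length m+1
                               const int *col_idx,   // CSR column indices of nonzeros
                               const float *vals,    // CSR nonzero values
                               const float *B,       // dense input,  k x n, row-major
                               float *C)             // dense output, m x n, row-major
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;  // row of A and C
    int col = blockIdx.y * blockDim.y + threadIdx.y;  // column of B and C
    if (row >= m || col >= n) return;

    float acc = 0.0f;
    // Walk the nonzeros of A's row and gather the matching rows of B.
    for (int p = row_ptr[row]; p < row_ptr[row + 1]; ++p) {
        acc += vals[p] * B[col_idx[p] * n + col];
    }
    C[row * n + col] = acc;
}

A launch such as dim3 block(16, 16); dim3 grid((m + 15) / 16, (n + 15) / 16); covers the output matrix. The uncoalesced gathers through col_idx in this baseline are exactly the memory-access pattern the abstract's Coalesced Row Caching targets.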
Keywords
graph neural networks,compatibility perspective,sophisticated sparse matrix representations,sparse matrix-vector,uncoalesced global memory access,GE-SpMM1,GNN frameworks,format transformation overhead,sparse data,dense data,graph dataset,general-purpose sparse matrix-matrix multiplication,compatible sparse-dense matrix-matrix multiplication,GPU,coarse-grained warp merging,SpMV,optimizations,GraphSAGE,GCN,GNN,Nvidia cuSPARSE