NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors

2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022

Abstract
We address the efficient design and implementation of dense matrix factorizations and inversion (DMFI) on modern multicore processors with several NUMA (non-uniform memory access) nodes. Our approach enhances the DMFI routines with a look-ahead strategy in order to overcome the “panel factorization bottleneck”. In addition, it exploits a hybrid of task- and loop-level parallelism while taking into account the NUMA organization of the memory hierarchy. Experiments on a Huawei Kunpeng-based server with two sockets and 48 cores per socket, covering three representative dense linear algebra operations, show that both the codes and their execution environment parameters must be adapted to improve data access locality. With these changes, performance across inter- and intra-socket NUMA configurations exceeds that of reference implementations from state-of-the-art libraries for this platform.
Keywords
Multicore processors, NUMA, dense linear algebra, look-ahead, multi-threaded parallelism, high performance