Learning Flexible GEMM Accelerator Configuration and Mapping-space using ML

Ananda Samajdar, Michael Pellauer

semanticscholar (2022)

Abstract
The value of flexibility in Deep Learning accelerators to adapt to diverse layer shapes and sizes is well understood. Contemporary reconfigurable architectures depend on compilers or other components in the software stack to search for the optimal configuration and mapping in order to fully exploit the benefits of flexibility. In this paper we show that the configuration and mapping space of flexible accelerators can be learnt using machine learning by casting it as a classification or recommendation problem. The learnt model can be used to obtain the optimal configuration of the target accelerator in constant time, without search. We propose ADAPTNET, a recommender system for obtaining the optimal configuration and mapping for GEMM workloads running on a Reconfigurable Systolic Array (RSA). The RSA is designed to be configured such that it can operate across a spectrum from a single monolithic array to a distributed collection of smaller arrays of various sizes with flexible aspect ratios. This allows us to simultaneously achieve scalability and high mapping flexibility while preserving operand reuse. ADAPTNET demonstrates 95% test accuracy compared to an exhaustively searched optimal configuration, beating state-of-the-art classification techniques such as SVMs, XGBoost and MLPs. We also present ADAPTNETX, a specialized core to run ADAPTNET in hardware. Together, RSA and ADAPTNETX enable us to demonstrate a new class of flexible accelerators which are capable of self-configuring in hardware for the given GEMM workload. We present a 32.768-TOPS instance called SAGAR that is capable of providing the same mapping flexibility as a compute-equivalent distributed system while achieving 3.5× more power efficiency and 3.2
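To illustrate the framing the abstract describes, the following is a minimal sketch of casting accelerator configuration selection as a classification problem. Everything here is a hypothetical stand-in: the features (GEMM shape M, N, K), the label encoding (an index into a fixed set of RSA configurations), the synthetic oracle, and the scikit-learn MLPClassifier used in place of ADAPTNET. It shows only the problem setup, not the paper's actual model or configuration space.

```python
# Hypothetical sketch: configuration selection as classification.
# Not the paper's ADAPTNET model; all encodings below are illustrative.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Features: GEMM workload shape (M, N, K).
# Labels: index of the "best" configuration out of n_configs choices,
# as an exhaustive search would produce for the training set.
n_samples, n_configs = 5000, 16
X = rng.integers(1, 4096, size=(n_samples, 3)).astype(np.float64)

# Stand-in oracle: pretend the optimal configuration depends on the
# GEMM's aspect ratio and total work (purely synthetic, for demo only).
aspect = np.log2(X[:, 0] / X[:, 1])
work = np.log2(X.prod(axis=1))
y = (np.digitize(aspect, np.linspace(-4, 4, 4)) * 4
     + np.digitize(work, np.linspace(10, 34, 4))) % n_configs

# Log-scale the shape features so sizes spanning orders of magnitude
# are comparable, then train a small classifier.
X_train, X_test, y_train, y_test = train_test_split(
    np.log2(X), y, test_size=0.2, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000,
                    random_state=0).fit(X_train, y_train)
print(f"test accuracy vs. oracle labels: {clf.score(X_test, y_test):.3f}")
```

Once trained, such a model returns a configuration for a new GEMM shape with a single forward pass, which is what makes the constant-time, search-free claim in the abstract plausible; the paper's contribution is a learned recommender accurate enough (95% on real labels) to replace the exhaustive search outright.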