Moses: Exploiting Cross-Device Transferable Features for on-Device Tensor Program Optimization.

HotMobile 2023

Abstract
Achieving efficient execution of machine learning models on mobile/edge devices has attracted significant attention recently. A key challenge is to efficiently generate high-performance tensor programs for each operator inside a DNN model. To this end, deep learning compilers have adopted auto-tuning approaches such as Ansor. However, optimizing tensor code for mobile/edge devices via auto-tuning is challenging due to limited time budgets and on-device resources. A key component of DNN compilers is the cost model, which predicts the performance of each candidate configuration on a specific device. However, current cost model designs cannot efficiently and effectively provide features that transfer across different hardware accelerators. In this paper, we propose Moses, a simple yet efficient design based on the lottery ticket hypothesis, which fully exploits hardware-agnostic features that transfer to the target device via domain adaptation, reducing the cost of the time-consuming auto-tuning process of DNN compilation on a new hardware platform. Compared with state-of-the-art approaches, Moses achieves up to a 1.53X efficiency gain in the search stage and a 1.41X inference speedup on challenging DNN benchmarks.
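The core idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the model, data, and pruning ratio below are all illustrative assumptions. It shows the general recipe of (1) training a cost model on abundant source-device measurements, (2) keeping only the largest-magnitude weights in lottery-ticket fashion, treating them as the hardware-agnostic part, and (3) fine-tuning the masked model on scarce target-device samples.

```python
# Hypothetical sketch of cross-device cost-model transfer (not Moses' code).
import numpy as np

rng = np.random.default_rng(0)

def train(X, y, w, mask, lr=0.1, steps=500):
    """Gradient descent on MSE; `mask` keeps pruned weights frozen at zero."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask
    return w

n_feat = 8
# Synthetic "configuration features" and latencies: only the first four
# features matter on either device (the transferable, device-agnostic part);
# the target device scales them differently.
true_w = np.zeros(n_feat)
true_w[:4] = [1.0, -2.0, 1.5, -0.5]
X_src = rng.normal(size=(200, n_feat))              # abundant source data
y_src = X_src @ true_w + 0.01 * rng.normal(size=200)
X_tgt = rng.normal(size=(20, n_feat))               # scarce target data
y_tgt = X_tgt @ (1.5 * true_w) + 0.01 * rng.normal(size=20)

w0 = rng.normal(scale=0.1, size=n_feat)             # shared initialization
w_src = train(X_src, y_src, w0, np.ones(n_feat))    # step 1: source training

# Step 2: lottery-ticket-style pruning, keep the 50% largest-magnitude weights.
mask = np.zeros(n_feat)
mask[np.argsort(-np.abs(w_src))[: n_feat // 2]] = 1.0

# Step 3: fine-tune only the surviving weights on the target device.
w_tgt = train(X_tgt, y_tgt, w0 * mask, mask)
err = np.mean((X_tgt @ w_tgt - y_tgt) ** 2)
```

Because the mask discovered on the source device covers exactly the features that matter on both devices, the fine-tuned model fits the target latencies well despite having only 20 target samples, which is the kind of efficiency gain in the search stage that the abstract reports.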