ACRE: Accelerating Random Forests for Explainability

56th IEEE/ACM International Symposium on Microarchitecture (MICRO 2023)

Abstract
As machine learning models become more widespread, they are being increasingly applied in applications that heavily impact people's lives (e.g., medical diagnoses, judicial system sentences, etc.). Several communities are thus calling for ML models to be not only accurate, but also explainable. To achieve this, recommendations must be augmented with explanations summarizing how each recommendation outcome is derived. Explainable Random Forest (XRF) models are popular choices in this space, as they are both very accurate and can be augmented with explainability functionality, allowing end-users to learn how and why a specific outcome was reached. However, the limitations of XRF models hamper their adoption, the foremost being the high computational demands associated with training such models to support high-accuracy classifications, while also annotating them with explainability meta-data. In response, we present ACRE, a hardware accelerator to support XRF model training. ACRE accelerates key operations that bottleneck performance, while maintaining meta-data critical to support explainability. It leverages a novel Processing-in-Memory hardware unit, co-located with banks of a 3D-stacked High-Bandwidth Memory (HBM). The unit locally accelerates the execution of key training computations, boosting effective data-transfer bandwidth. Our evaluation shows that, when ACRE augments HBM3 memory, it yields an average system-level training performance improvement of 26.6x, compared to a baseline multicore processor solution with DDR4 memory. Further, ACRE yields a 2.5x improvement when compared to an HBM3 architecture baseline, increasing to 5x when not bottlenecked by a 16k-thread limit in the host. Finally, due to much higher performance, we observe that ACRE provides a 16.5x energy reduction overall, over a DDR baseline.
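The abstract identifies split evaluation, which compares every sample's feature value against candidate thresholds and computes entropy for each resulting partition, as the kind of key training computation ACRE offloads to its Processing-in-Memory unit. As a minimal illustrative sketch (not the paper's implementation; the function names and the exhaustive threshold scan are assumptions for illustration), the host-side version of this workload looks like:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label multiset -- the per-split metric whose
    repeated evaluation dominates decision tree training cost."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(feature_values, labels):
    """Scan candidate thresholds on one continuous feature and return the
    (threshold, information_gain) pair with the highest gain.

    Each candidate threshold requires comparing all samples against it and
    recomputing child entropies -- the comparison-heavy, bandwidth-bound
    loop that a near-bank PiM unit could execute without moving the data
    to the host. (Sketch only; real RF trainers use incremental counts.)"""
    pairs = sorted(zip(feature_values, labels))
    n = len(pairs)
    parent = entropy(labels)
    best_thr, best_gain = None, 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no valid threshold between equal feature values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lbl for _, lbl in pairs[:i]]
        right = [lbl for _, lbl in pairs[i:]]
        gain = (parent
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        if gain > best_gain:
            best_thr, best_gain = thr, gain
    return best_thr, best_gain
```

For example, `best_split([1, 2, 3, 4], [0, 0, 1, 1])` returns the threshold 2.5 with an information gain of 1.0, since both children are pure. A forest repeats this scan over many features, nodes, and trees, which is why the data movement it implies becomes the system bottleneck the paper targets.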
Keywords
Random Forest,High Performance,Machine Learning Models,Random Forest Model,Hardware Accelerators,Training Data,Feature Values,Parallelization,Number Of Comparisons,Data Transfer,Entropy Values,General Data Protection Regulation,Network Bandwidth,High Memory,Continuous Features,Memory Devices,Parallel Comparison,Memory Bandwidth,Memory Bank,Random Forest Training,AI Models,Output Bits,Cache Hit,Small Range Of Values,Bandwidth Demand,Approximate Entropy,Written Back,Parallel Threads,Explainable Artificial Intelligence,Memory System