HiRe: using hint & release to improve synchronization of speculative threads.

ICS(2012)

Abstract
Thread-Level Speculation (TLS) is a promising technique for improving the performance of serial codes on multi-cores by automatically extracting threads and running them in parallel. However, the speculation efficiency as well as the performance gain of TLS systems are reduced by cross-thread data dependence violations. Reducing the cost and frequency of violations is key to improving the efficiency of TLS. One method to keep a dependence from violating is to predict it and communicate the value via synchronization. However, prior work in this field still cannot handle enough violating dependences, especially hard-to-predict ones and those in non-loop TLS tasks. Prior techniques also suffer from over-synchronization and/or introduce complicated hardware. The major reason is that these techniques are highly sensitive to the accuracy of the dependence prediction, which is hard to improve in the face of irregular dependence and task patterns. In this paper, we propose a novel synchronization technique that avoids over-synchronization and works for irregularly occurring dependences. We use a profiler to find and mark store-load pairs that generate data dependences. Then, the compiler schedules a hint instruction in advance of the store to inform successor threads of a possible pending write to a specific address; in this way, later loads only wait for a store if the loading location has been hinted. The compiler also schedules a release instruction that notifies the load when it should proceed. It places the release both after the store and on every path leading away from the hint that does not pass through the store. By placing it on all such paths, we limit the cost due to over-synchronization. Together, the hint and release form our proposal, called HiRe.
We implemented the HiRe scheme on a well-tuned TLS system and evaluated it on a set of SPEC CPU 2000 applications; we find that HiRe suffers only 22% of the violations that occur in our base TLS system, and it cuts the instruction waste rate of TLS in half. Furthermore, it outperforms prior approaches we studied by 3%.
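The hint and release behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not describe the hardware or instruction encoding, so the `HintTable` class, its method names, the `demo` function, and the addresses used are all hypothetical. The model is single-threaded and only tracks whether a successor's load would have to wait at each point.

```python
class HintTable:
    """Toy model (hypothetical) of the hinted addresses seen by successor threads."""

    def __init__(self):
        self._hinted = set()  # addresses with a possibly pending store

    def hint(self, addr):
        # Scheduled ahead of the store: announce a possible pending write.
        self._hinted.add(addr)

    def release(self, addr):
        # Tell waiting loads they may proceed.
        self._hinted.discard(addr)

    def load_must_wait(self, addr):
        # A load waits only if its address has been hinted.
        return addr in self._hinted


def demo(take_store_path):
    """Run one predecessor task and record what a successor load would see."""
    table = HintTable()
    memory = {0x10: 0}
    trace = []

    trace.append(table.load_must_wait(0x10))  # before the hint
    table.hint(0x10)
    trace.append(table.load_must_wait(0x10))  # hinted address
    trace.append(table.load_must_wait(0x20))  # unrelated address

    if take_store_path:
        memory[0x10] = 42
        table.release(0x10)  # release placed after the store
    else:
        table.release(0x10)  # release on the path that bypasses the store

    trace.append(table.load_must_wait(0x10))  # after the release
    return trace, memory[0x10]
```

In this model the successor waits only between the hint and the release, and only for the hinted address. Because the release also sits on the path that skips the store, the load is not left waiting when the predicted store never executes, which is the over-synchronization cost the abstract says the release placement is meant to limit.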