InCheck: An in-application recovery scheme for soft errors

DAC(2017)

引用 16|浏览22
暂无评分
摘要
An ideal solution for soft error tolerance should hide the effect of soft errors from user and provide correct results at expected time. Software solutions are attractive because they can provide flexible reliability without imposing any hardware modifications. Our investigation of state-of-the-art error recovery techniques reveals that they suffer from poor coverage (ability to detect and correctly recover from soft errors). This paper presents InCheck (In-application Checkpointing and Recovery) as an effective, safe and timely software technique for complete error coverage. The key features of InCheck are: verified register preservation, single memory location checkpoints, and safe & timely recovery. To evaluate the effectiveness of InCheck, we performed more than 210,000 fault injection experiments on different hardware components of an ARM cortex53-like processor running MiBench applications. The original and SWIFT-R (state-of-the-art) protected programs suffered from 8000 and 1800 instances of wrong outputs respectively, but when protected by InCheck, there was no failure.
更多
查看译文
关键词
soft error tolerance,software solutions,InCheck,error recovery,software technique,error coverage,in-application checkpointing and recovery,verified register preservation,memory location checkpoints,fault injection
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要