Pavise: Integrating Fault Tolerance Support for Persistent Memory Applications.

PACT (2022)

Abstract
Persistent memory (PM) allows programmers to bypass the file system and efficiently manage persistent data directly. As a consequence, the application is now responsible for a non-trivial task: maintaining data crash consistency. In addition, it is highly desirable for today's production-grade storage systems to have fault tolerance so that they can recover from data corruption. Systems may provide fault tolerance through data redundancy. However, direct PM accesses bypass the system and leave the data vulnerable to corruption. Without system-level support, it is the application's responsibility to maintain both crash consistency and fault tolerance, creating a demand for software tools that relieve the application programmer of this burden. Providing fault tolerance is challenging in the absence of system support because it is difficult to track data updates and to efficiently update data along with its redundancy in a crash-consistent manner. Existing fault-tolerance mechanisms for PM applications either impose significant programming restrictions on the programmer or compromise on the level of protection they provide. This paper designs and implements Pavise, a software framework that provides protection for data within PM applications. Pavise uses a compiler pass to automatically track accesses to persistent data. It co-designs fault tolerance operations with the crash consistency mechanism to efficiently update data and its redundancy while maintaining the crash consistency guarantee. Pavise can be applied to existing PM applications with minimal manual effort and modest overheads. Our evaluation of common PM applications shows that Pavise achieves 83.2% (with ignore-list) and 70.9% (with conservative tracking) of the performance of the current state-of-the-art fault-tolerance software system, Pangolin. Because Pavise provides both application and library data with equally strong protection, Pavise can sustain a much higher error rate of 10⁻⁵, compared to Pangolin's 10⁻⁷.
Keywords
persistent memory, crash consistency, redundancy, fault tolerance