Basic Information
Biography
My doctoral research focused on understanding how commodity file systems handle various types of modern disk failures and examined ways to build file systems with Internal RObustNess (IRON) against disk errors.
Disk drives are widely used as a primary medium for storing information. While commodity file systems trust disks to either work or fail completely, modern disks exhibit complex failure modes such as latent sector faults and block corruptions, where only portions of a disk fail. In this thesis, we focus on understanding the failure policies of file systems and improving their robustness to disk failures. We suggest a new fail-partial failure model for disks, which incorporates realistic localized faults such as latent sector faults and block corruption. We then develop and apply a novel semantic failure analysis technique, which uses file system block type knowledge and transactional semantics, to inject interesting faults and investigate how commodity file systems react to a range of more realistic disk failures.
We apply our technique to five important journaling file systems: Linux ext3, ReiserFS, JFS, XFS, and Windows NTFS. We classify their failure policies in a new taxonomy that measures their Internal RObustNess (IRON), which includes both failure detection and recovery techniques. Our analysis results show that commodity file systems store little or no redundant information, and contain failure policies that are often inconsistent, sometimes buggy, and generally inadequate in their ability to recover from partial disk failures.
We remedy the reliability shortcomings in commodity file systems by addressing two issues. First, we design new low-level redundancy techniques that a file system can use to handle disk faults. We begin by qualitatively and quantitatively evaluating various forms of redundant information, such as checksums, parity, and replicas. Then, in order to account for spatially correlated faults, we propose a new probabilistic model that can be used to construct redundancy sets. Finally, we describe two update strategies, an overwrite approach and a no-overwrite approach, that a file system can use to update its data and parity blocks atomically without NVRAM support. Overall, we show that low-level redundant information can greatly enhance file system robustness while incurring modest time and space overheads.
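As a rough illustration of how checksums and parity can complement each other, the sketch below uses a per-block checksum to detect a corrupted block and a single XOR parity block to rebuild it. This is not the thesis's implementation: the helper names (`checksum`, `xor_blocks`, `recover`), the use of SHA-256, and the in-memory block list are all assumptions made for the example.

```python
import hashlib
from functools import reduce


def checksum(block: bytes) -> bytes:
    # Hypothetical choice of checksum; any strong hash would do for the sketch.
    return hashlib.sha256(block).digest()


def xor_blocks(blocks):
    # Byte-wise XOR of equally sized blocks; this is the parity block.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover(blocks, checksums, parity):
    """Detect bad blocks by checksum and repair at most one using parity."""
    blocks = list(blocks)
    bad = [i for i, (b, c) in enumerate(zip(blocks, checksums)) if checksum(b) != c]
    if not bad:
        return blocks
    if len(bad) > 1:
        raise ValueError("one parity block cannot repair more than one bad block")
    i = bad[0]
    survivors = [b for j, b in enumerate(blocks) if j != i]
    repaired = xor_blocks(survivors + [parity])
    if checksum(repaired) != checksums[i]:
        raise ValueError("repair failed: parity or a surviving block is also bad")
    blocks[i] = repaired
    return blocks
```

In this sketch the checksum only detects a fault and the parity block only repairs it, which is why the two kinds of redundancy are stored together; a replica would instead trade more space for a simpler repair path.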
Second, to remedy the problem of failure handling diffusion, we develop a modified ext3 that unifies all failure handling in a centralized module. We then showcase the power of centralized failure handling in ext3c, a modified IRON version of ext3 that uses this centralized module, by demonstrating its support for flexible, consistent, and fine-grained policies. By carefully separating policy from mechanism, ext3c demonstrates how a file system can provide a thorough, comprehensive, and easily understandable failure-handling policy.
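The idea of separating policy from mechanism can be sketched as a single table that maps a block type and fault kind to an action, consulted by one handler. Everything here is hypothetical and only meant to convey the shape of the design: the `Action` names, the `POLICY` entries, and the `handle_failure` function are assumptions for illustration, not the interface of ext3c.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical recovery actions chosen for this sketch.
    RETRY = "retry"
    PROPAGATE = "propagate"
    STOP = "stop"


DEFAULT_ACTION = Action.PROPAGATE

# Policy: one declarative table, kept apart from the code that acts on it.
POLICY = {
    ("inode", "read_error"): Action.RETRY,
    ("journal", "corruption"): Action.STOP,
}


def decide(block_type, fault, policy=POLICY):
    # Fall back to one default so that unlisted cases behave consistently.
    return policy.get((block_type, fault), DEFAULT_ACTION)


def handle_failure(block_type, fault, retry_io, policy=POLICY, max_retries=3):
    """Mechanism: the single place where every I/O failure is handled."""
    action = decide(block_type, fault, policy)
    if action is Action.RETRY:
        for _ in range(max_retries):
            if retry_io():
                return "recovered"
        return "error_returned"
    if action is Action.STOP:
        return "stopped"
    return "error_returned"
```

Because every failure passes through one function and one table, changing how a given block type reacts to a given fault is an edit to `POLICY` rather than to code scattered across the file system.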
Papers (55 in total)
EUROSYS, pp. 1-14 (2014)
SOSP, pp. 309-324 (2013)
user-5f3206704c775e3a7964bd8b(2013)
Citations: 0, Views: 0
NSDI'12: Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, pp. 1-14 (2012)
Citations: 229, Views: 0, EI-indexed