Reliability-Aware Speedup Models for Parallel Applications with Coordinated Checkpointing/Restart

IEEE Trans. Computers (2015)

Abstract
Speedup models are powerful analytical tools for evaluating and predicting the performance of parallel applications. Unfortunately, well-known speedup models such as Amdahl’s law and Gustafson’s law do not take reliability into consideration and therefore cannot accurately account for application performance in the presence of failures. In this study, we enhance Amdahl’s law and Gustafson’s law by considering the impact of failures and the effect of coordinated checkpointing/restart. Unlike existing analytical studies that rely on the Exponential failure distribution alone, we consider both Exponential and Weibull failure distributions in the construction of our reliability-aware speedup models. The derived models are validated through trace-based simulations under a variety of parameter settings, and these simulations demonstrate that the models can effectively quantify the impact of failures on application speedup. Moreover, we present two case studies to illustrate the use of these reliability-aware speedup models.
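For reference, the two baseline laws named in the abstract can be sketched as follows. This is a minimal illustration of the classic, failure-free formulas only; it does not reproduce the paper's reliability-aware extensions, and the function and parameter names (`amdahl_speedup`, `gustafson_speedup`, `parallel_fraction`, `n_procs`) are hypothetical choices made for this sketch.

```python
def _check_inputs(parallel_fraction: float, n_procs: int) -> None:
    """Validate the shared inputs of both speedup formulas."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")


def amdahl_speedup(parallel_fraction: float, n_procs: int) -> float:
    """Classic Amdahl's law (fixed problem size, no failures).

    speedup = 1 / ((1 - p) + p / N)
    """
    _check_inputs(parallel_fraction, n_procs)
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


def gustafson_speedup(parallel_fraction: float, n_procs: int) -> float:
    """Classic Gustafson's law (problem size scales with N, no failures).

    speedup = (1 - p) + p * N
    """
    _check_inputs(parallel_fraction, n_procs)
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * n_procs


if __name__ == "__main__":
    # Compare the two failure-free baselines for a 95%-parallel workload.
    for n in (1, 10, 100, 1000):
        print(n, amdahl_speedup(0.95, n), gustafson_speedup(0.95, n))
```

Neither formula has a term for failures, checkpoint overhead, or restart cost, which is the gap the abstract says the reliability-aware models are meant to address.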
Keywords
Gustafson's law, speedup, Amdahl's law, analytical modeling, reliability, exponential distribution, parallel processing, mathematical model, Weibull distribution, computational modeling