A Chaos Recommendation Tool for Reliability Testing in Large-Scale Cloud-Native Systems

Mudit Verma, Sandeep Hans, Diptikalyan Saha, Praveen Jayachandran, Eitan Farchi, Naga Ravi Chaitanya Elluri, Tullio Sebastiani, Paige Rubendall, Yogananth Subramanian, Pradeep Surisetty, Brian Riordan

International Conference on Communication Systems and Networks (2024)

Abstract
With the proliferation of cloud-native systems supported by container technology and the widespread deployment of 5G and Edge use cases, modern applications have become increasingly distributed and complex, often consisting of hundreds of components. Ensuring the reliability of these workloads has consequently become harder, and it is further complicated by the continuous evolution of systems under CI/CD practices. In this context, Chaos Engineering can play a crucial role in assessing the reliability of these large-scale systems by intentionally introducing adverse conditions and gauging their resilience in interconnected environments. This controlled approach enables organizations to identify and learn from potential failure points before they escalate into full-blown service degradation and production outages. Yet the effectiveness of chaos testing hinges on the relevance of the targeted fault scenarios, and in practice fault injection is often arbitrary or intuition-driven, leading to inefficiencies and suboptimal outcomes. To address these challenges, we have developed a chaos-recommendation tool. The tool assesses the real-time behavior and characteristics of workloads and suggests fault injections that can cause disruptions. In this demo, we illustrate how the chaos-recommendation tool can be used to automatically identify potential failure points for a system and suggest corresponding chaos test cases. The tool, part of Red Hat's Chaos Engineering project Kraken, is open source and available at: https://github.com/redhat-chaos/krkn/blob/main/utils/chaos_recommender/README.md
Keywords
Reliability Assurance, Chaos, Cloud-Native, AI