Optimizing system monitoring configurations for non-actionable alerts

NOMS (2012)

Abstract
Today's competitive business climate and the complexity of IT environments dictate efficient and cost-effective delivery and support of IT services. This is largely achieved by automating routine maintenance procedures, including problem detection, determination, and resolution. System monitoring provides an effective and reliable means of problem detection. Coupled with automated ticket creation, it ensures that a degradation of the vital signs, defined by acceptable thresholds or monitoring conditions, is flagged as a problem candidate and sent to support personnel as an incident ticket. This paper describes a novel methodology and a system for minimizing non-actionable tickets while preserving all tickets that require corrective action. The proposed method defines monitoring conditions and their optimal delay times based on an off-line analysis of historical alerts and the matching incident tickets. Potential monitoring conditions are built on a set of predictive rules that are automatically generated by a rule-based learning algorithm with coverage, confidence, and rule complexity criteria. These conditions and delay times are propagated as configurations into run-time monitoring systems.
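The abstract does not spell out how a delay time is chosen, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that a delay means "wait this long before creating a ticket", and that an alert whose condition clears before the delay elapses is dropped. The `Alert` record, its fields, and the `choose_delay` and `suppressed` helpers are hypothetical names introduced for illustration, and the sample history is made up; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    # Hypothetical record of one historical alert (not the paper's data model).
    duration_min: float  # how long the alerting condition persisted, in minutes
    actionable: bool     # True if the matching ticket needed corrective action


def choose_delay(alerts):
    """Return the largest delay that suppresses no actionable alert.

    Assumption: an alert is suppressed when its duration is strictly less
    than the delay, so the shortest actionable duration is the upper bound.
    """
    actionable_durations = [a.duration_min for a in alerts if a.actionable]
    if not actionable_durations:
        # No actionable history to justify waiting: conservatively use no delay.
        return 0.0
    return min(actionable_durations)


def suppressed(alerts, delay):
    """Alerts that would clear before the delay elapses and never become tickets."""
    return [a for a in alerts if a.duration_min < delay]


# Illustrative history only; the values are invented for this sketch.
history = [
    Alert(2, False),
    Alert(3, False),
    Alert(12, False),
    Alert(10, True),
    Alert(15, True),
]

delay = choose_delay(history)
dropped = suppressed(history, delay)
print(delay, len(dropped))
```

In this sketch the constraint "preserve all tickets which require corrective action" is enforced first, and the reduction in non-actionable tickets is whatever that constraint allows. A non-actionable alert that outlasts the chosen delay still becomes a ticket, which is where condition-specific rules would have to do the remaining work.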
Keywords
automating routine maintenance procedures, offline analysis, knowledge based systems, rule-based learning algorithm, coverage criteria, problem resolution, run-time monitoring systems, learning (artificial intelligence), incident ticket, IT environments, nonactionable ticket minimization, problem detection, automated ticket creation, software maintenance, historical alerts, computational complexity, corrective action, monitoring conditions, nonactionable alerts, rule complexity criteria, confidence criteria, delay times, system monitoring configuration optimization, IT services, incident ticket matching, system monitoring, problem determination, cost effective service delivery, learning artificial intelligence, service delivery, testing, cost effectiveness, rule based, servers, prediction algorithms, accuracy