Enabling Reliable and Efficient Performance Monitoring and Troubleshooting in Datacenter Networks.

IWQoS(2023)

引用 1|浏览18
暂无评分
摘要
Network performance monitoring and troubleshooting is a crucial but challenging task in datacenter management. Despite the numerous solutions that have been proposed in recent years, their efforts are often hindered by high costs and unreliable fault localization, making it difficult to deploy them in real-world environments. In this paper, we present LMon, a highly reliable and efficient system for monitoring and troubleshooting in datacenter networks. LMon utilizes the characteristic of ECMP hashing linearity to control the packet routing without any modification of the underlying protocols. Additionally, LMon leverages a lightweight probing technique to reduce monitoring overhead. Furthermore, the system integrates improved LASSO regression and statistical hypothesis testing for higher accuracy and faster processing in link failure localization. The effectiveness of LMon is demonstrated through its implementation and evaluation in ns-3 simulation. The results validate the reliability and efficiency of the system, making it a promising option for ensuring long-term network maintenance in datacenters.
更多
查看译文
关键词
crucial but challenging task,datacenter management,datacenter networks,datacenters,efficient performance monitoring,enabling reliable,highly reliable system,lightweight probing technique,link failure localization,LMon leverages,long-term network maintenance,monitoring overhead,network performance monitoring,numerous solutions,packet routing,real-world environments,reliability,statistical hypothesis testing,troubleshooting,unreliable fault localization
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要