Clustering-Based Numerosity Reduction for Cloud Workload Forecasting

Algorithmic Aspects of Cloud Computing, ALGOCLOUD 2023 (2024)

Abstract
Finding smaller versions of large datasets that preserve the characteristics of the originals is becoming a central problem in Machine Learning, especially when computational resources are limited and energy consumption must be reduced. In this paper, we apply clustering techniques to select a subset of datasets for training time series models that predict future workload in cloud computing. We train Bayesian Neural Networks (BNNs) and state-of-the-art probabilistic models to predict the distribution of machine-level future resource demand, and evaluate them on unseen data from virtual machines in the Google Cloud data centre. Experiments show that selecting the training data via clustering approaches such as Self-Organising Maps allows the model to achieve the same accuracy in less than half the time and with less than half the datasets, compared with selecting more data at random. Moreover, BNNs can capture aspects of uncertainty that can better inform scheduling decisions, which state-of-the-art time series forecasting methods cannot do. All the considered models achieve prediction times suitable for real-world scenarios.
Keywords
Cloud Computing, Workload Prediction, Clustering, Bayesian Neural Network, Deep Learning