Towards automation of computing fabrics using tools from the fabric management workpackage of the EU DataGrid project
Clinical Orthopaedics and Related Research(2003)
Abstract
Workpackage 4 of the EU DataGrid project aims to provide the
necessary tools for automating the management of medium-sized to very large
computing fabrics. At the end of the second project year, subsystems for
centralized configuration management (presented at LISA'02) and
performance/exception monitoring have been delivered. These will soon be
augmented with a subsystem for node installation and service configuration,
which is based on existing, widely used standards where available (e.g. rpm,
kickstart, init.d scripts) and on clean interfaces to OS-dependent components
(e.g. base installation and service management). The three subsystems together
allow for centralized management of very large computer farms. Finally, a fault
tolerance system is being developed to tie the above subsystems together into
a complete framework for automated enterprise computing management by the
third quarter of 2003. All software developed is open source and covered by the
EU DataGrid project license agreements. This article describes the architecture behind the designed
fabric management system and the status of the different developments. It also
covers the experience with an existing tool for automated configuration and
installation that has been adapted and used from the beginning to manage the
EU DataGrid testbed, which is now used for LHC data challenges.
Keywords
configuration management, management system, cluster computing, fault tolerant system, software development, service management