Exploring Resource Migration Using the CephFS Metadata Cluster

Michael Sevilla, Scott Brandt, Carlos Maltzahn, Ike Nassi, Sam Fineberg

Semantic Scholar (2014)

Abstract

Understanding the effects of migrating resources is an important part of load balancing. Today’s systems can already virtualize memory, and the ability to migrate other resources, such as CPU, disks, and network, is fast approaching. When we finally have the ability to migrate different resources, how do we know when and where to move them? Such migration will depend on the utilization, configuration, and workload, but how will we weight these factors to design robust, guaranteeable systems? In this work, we propose using metadata management as a substrate for exploring different heuristics for resource migration and load balancing. We hypothesize that an effective metadata management strategy will also depend on the utilization, configuration, and workload.

POSIX-compliant systems are important for legacy software and users accustomed to hierarchical file systems. Unfortunately, file metadata is highly accessed and does not scale for sufficiently large systems in the same way that read and write throughput do [1, 3]. File metadata is very different from regular data; the need to distribute it amongst many nodes is not a result of its size, but its popularity. Maintaining a file system hierarchy and file attributes is notoriously difficult in high-performance computing (HPC), where checkpointing behavior induces “flash crowds” of clients simultaneously opening, writing, and destroying files in the same vicinity (e.g., a directory). The “big data” era has rendered proven metadata management techniques insufficient for metadata-intensive workloads. For example, Google had to add support for multiple masters to manage metadata because today’s workloads often deal with many small files (e.g., log processing) and a large number of simultaneous clients (e.g., MapReduce jobs) [2]. Suddenly, the metadata problem, once reserved for HPC, has found its way into large data centers.

While hash-based metadata management and object stores do well to evenly distribute metadata and its load, they sacrifice the locality inherent in hierarchical file systems. Caching popular inodes can help improve locality, but this technique is limited by the size of the caches and only stores data that has already been seen, instead of data that is related. We use the Ceph file system (CephFS) as a platform for attacking the metadata management problem because it was built with locality in mind and the tools for resource migration and hotspot detection are already implemented.
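
The trade-off between hash-based placement and hierarchical locality can be illustrated with a small sketch. It is a toy model under stated assumptions, not CephFS's implementation: the server count, the example paths, the `subtree_owner` table, and the helper names `hash_placement`, `subtree_placement`, and `servers_touched` are all hypothetical. Hashing full paths spreads one directory's files across servers, while a prefix-based (subtree) assignment keeps them on a single server.

```python
import hashlib

NUM_SERVERS = 4  # hypothetical number of metadata servers


def hash_placement(path, num_servers=NUM_SERVERS):
    """Pick a server from a hash of the full path (spreads load, ignores hierarchy)."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def subtree_placement(path, subtree_owner, default=0):
    """Pick the server that owns the longest matching directory prefix."""
    parts = path.strip("/").split("/")
    for depth in range(len(parts), 0, -1):
        prefix = "/" + "/".join(parts[:depth])
        if prefix in subtree_owner:
            return subtree_owner[prefix]
    return default  # paths outside every listed subtree go to a default server


def servers_touched(paths, place):
    """Return the set of servers needed to serve metadata for all paths."""
    return {place(p) for p in paths}


# A "flash crowd": 64 clients create checkpoint files in one directory.
checkpoint_files = [f"/scratch/job42/ckpt/rank{i:04d}.dat" for i in range(64)]

# Hypothetical static assignment of subtrees to servers.
subtree_owner = {"/scratch/job42": 1, "/home": 2}

hashed = servers_touched(checkpoint_files, hash_placement)
by_subtree = servers_touched(
    checkpoint_files, lambda p: subtree_placement(p, subtree_owner)
)

print("servers touched with hashing:", sorted(hashed))
print("servers touched with subtrees:", sorted(by_subtree))
```

In this sketch the subtree scheme preserves locality but concentrates the whole flash crowd on one server, which is the kind of hotspot that migration heuristics would need to detect and relieve.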
