A Virtual File System for On-Demand Processing of Multidimensional Datasets.

XSEDE (2016)

Abstract
Diverse areas of science and engineering are increasingly driven by high-throughput automated data capture and analysis. Modern acquisition technologies, used in many scientific applications (e.g., astronomy, physics, materials science, geology, biology, and engineering) and often running at gigabyte-per-second data rates, quickly generate terabyte- to petabyte-scale datasets that must be stored, shared, processed, and analyzed at similar rates. The largest datasets are often multidimensional, such as volumetric and time series data derived from various types of image capture. Cost-effective and timely processing of these data requires system and software architectures that incorporate on-the-fly processing to minimize I/O traffic and avoid latency limitations. In this paper we present the Virtual Volume File System, a new approach to on-demand processing with file system semantics, combining these principles into a versatile and powerful data pipeline for dealing with some of the largest 3D volumetric datasets. We give an example of how we have started to use this approach in our work with massive electron microscopy image stacks. We end with a short discussion of current and future challenges.
Keywords
active storage file system, multidimensional data processing, near-data computing, data sharing, data duplication, hierarchical data storage