Going fast on a small-size computing cluster

Niclas Steve Eich, Martin Erdmann, Svenja Diekmann, Manfred Peter Fackeldey, Benjamin Fischer, Dennis Noll, Yannik Alexander Rath

Journal of Physics: Conference Series (2023)

Abstract
Fast turnaround times for LHC physics analyses are essential for scientific success. The ability to quickly perform optimization and consolidation studies is critical. At the same time, computing demands and complexity are rising with the upcoming data-taking periods and with new technologies such as deep learning. We present a showcase of the HH → bbWW analysis at the CMS experiment, in which we process O(1-10) TB of data on 100 threads in a few hours. The analysis is based on the columnar NanoAOD data format and makes use of the NumPy ecosystem and HEP-specific tools, in particular Coffea and Dask. Data locality, and IO latency in particular, is optimized by a multi-level caching structure that uses local file storage and on-worker SSD caches. We process thousands of events simultaneously within a single thread, which enables straightforward use of vectorized operations. Resource-intensive computing tasks, such as GPU-accelerated DNN inference and histogram aggregation in the O(10) GB regime, are offloaded to dedicated workers. The analysis consists of hundreds of distinctly different workloads and is steered through a workflow management tool, which ensures reproducibility throughout the development process up to journal publication.
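The columnar, chunk-wise processing pattern described in the abstract can be illustrated with a minimal NumPy-only sketch. It is not the authors' analysis code: the function names `process_chunk` and `run`, the selection cuts, the binning, and the randomly generated columns are all hypothetical, and the real analysis relies on Coffea and Dask rather than a plain loop. The sketch only shows how a whole chunk of events is selected and histogrammed with array operations, and how per-chunk histograms are summed into one result.

```python
import numpy as np


def process_chunk(pt, eta, bins):
    """Apply vectorized cuts to one chunk of events and histogram the result.

    pt and eta are 1-D arrays with one entry per event (hypothetical columns).
    """
    # Hypothetical selection: a boolean mask computed for all events at once.
    mask = (pt > 25.0) & (np.abs(eta) < 2.4)
    counts, _ = np.histogram(pt[mask], bins=bins)
    return counts


def run(chunks, bins):
    """Sum the per-chunk histograms, standing in for histogram aggregation."""
    total = np.zeros(len(bins) - 1, dtype=np.int64)
    for pt, eta in chunks:
        total += process_chunk(pt, eta, bins)
    return total


# Toy input: 5 chunks of 10,000 events each, drawn from arbitrary distributions.
rng = np.random.default_rng(seed=0)
bins = np.linspace(0.0, 200.0, 21)  # 20 bins of width 10
chunks = [
    (rng.exponential(40.0, size=10_000), rng.uniform(-3.0, 3.0, size=10_000))
    for _ in range(5)
]
hist = run(chunks, bins)
```

Because histograms are additive, the chunks can be processed independently and in any order, which is what makes it possible to distribute them over many threads and to merge the partial results on a separate worker.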
Keywords
cluster, computing, fast, small-size