NUPAR: A Benchmark Suite for Modern GPU Architectures.

ICPE (2015)

Abstract
Heterogeneous systems consisting of multi-core CPUs, Graphics Processing Units (GPUs), and many-core accelerators have gained widespread use among application developers and data-center platform developers. Modern heterogeneous systems have evolved to include advanced hardware and software features that support a spectrum of application patterns. Heterogeneous programming frameworks such as CUDA, OpenCL, and OpenACC have all introduced new interfaces that enable developers to use the new features on these platforms. In emerging applications, performance optimization is no longer limited to effectively exploiting data-level parallelism; it also includes leveraging new degrees of concurrency and parallelism to accelerate the entire application. To aid hardware architects and application developers in effectively tuning performance on GPUs, we have developed the NUPAR benchmark suite. The NUPAR applications belong to a number of different scientific and commercial computing domains. These benchmarks exhibit a range of GPU computing characteristics, including memory-bandwidth limitations, device occupancy and resource utilization, synchronization latency, and device-specific compute optimizations. The NUPAR applications are specifically designed to stress new hardware and software features, including nested parallelism, concurrent kernel execution, shared host-device memory, and new instructions for precise computation and data movement. In this paper, we focus our discussion on applications developed in CUDA and OpenCL and on high-end server-class GPUs. We describe these benchmarks and evaluate their interaction with different architectural features on a GPU. Our evaluation examines the behavior of the advanced hardware features on recently released GPU architectures.