Efficient GPU Implementation of Automatic Differentiation for Computational Fluid Dynamics.

International Conference on High Performance Computing, Data, and Analytics (2023)

Abstract
Many scientific and engineering applications require repeated calculations of derivatives of output functions with respect to input parameters. Automatic Differentiation (AD) is a method that automates derivative calculations and can significantly speed up code development. In Computational Fluid Dynamics (CFD), derivatives of flux functions with respect to state variables (Jacobian) are needed for efficient solutions of the nonlinear governing equations. AD of flux functions on graphics processing units (GPUs) is challenging as flux computations involve many intermediate variables that create high register pressure and require significant memory traffic because of the need to store the derivatives. This paper presents a forward-mode AD method based on multivariate dual numbers that addresses these challenges and simultaneously reduces the floating-point operation count. The dimension of the multivariate dual numbers is optimized for performance. The flux computations are restructured to minimize the number of temporary variables and reduce register pressure. For effective utilization of memory bandwidth, shared memory is used to store the local flux Jacobian. This AD implementation is compared with several other Jacobian implementations on an NVIDIA V100 GPU (V100). For three-dimensional perfect-gas compressible-flow equations implemented in a practical CFD code, the AD implementation of a flux Jacobian based on multivariate dual numbers of dimension 5 outperforms all other GPU AD implementations on V100. Its performance is comparable with the optimized hand-differentiated version. The implementation achieves 75% of the peak floating-point throughput and 61% of the peak global device memory bandwidth usage.
Keywords
Automatic Differentiation (AD), Multivariate Dual Numbers, Graphics Processing Unit (GPU), Performance Optimization