HDPose: Post-Hierarchical Diffusion with Conditioning for 3D Human Pose Estimation

Donghoon Lee, Jaeho Kim

Sensors (2024)

Abstract
Monocular 3D human pose estimation (HPE) methods have recently been used to predict 3D poses accurately by addressing the ill-posed problem caused by 3D-to-2D projection. However, monocular 3D HPE remains challenging owing to inherent depth ambiguity and occlusions. To address this issue, previous studies have proposed approaches based on denoising diffusion probabilistic models (DDPMs) that learn to reconstruct a correct 3D pose from a noisy initial 3D pose. These approaches additionally use 2D keypoints or context encoders, which encode spatial and temporal information, to condition the model. However, they often fall short of peak performance or take a long time to converge to the target pose. In this paper, we propose HDPose, which converges rapidly and predicts 3D poses accurately. Our approach aggregates spatial and temporal information from the condition into a denoising model in a hierarchical structure. We observe that the post-hierarchical structure achieves the best performance among the condition structures we compared. We evaluate our model on the widely used Human3.6M and MPI-INF-3DHP datasets. The proposed model demonstrates performance competitive with state-of-the-art models, achieving high accuracy with faster convergence while being considerably more lightweight.
Keywords
3D human pose estimation, diffusion, transformer, hierarchical structure
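
The abstract refers to diffusion-based approaches that reconstruct a 3D pose from a noisy initial 3D pose. As a rough illustration of that general idea only, the sketch below shows the standard DDPM forward noising step applied to a toy pose, and how a clean pose is recovered once the noise is known. It is not HDPose's implementation: the function names, the linear schedule values, and the three-joint pose are assumptions made for this example, and the true noise stands in for the output of a trained denoiser conditioned on 2D keypoints.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linearly spaced noise variances (illustrative default values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for s up to t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pose, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.

    `pose` is a list of [x, y, z] joints. Returns (noisy_pose, noise).
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noisy, noise = [], []
    for joint in pose:
        eps = [rng.gauss(0.0, 1.0) for _ in joint]
        noise.append(eps)
        noisy.append([signal * c + sigma * e for c, e in zip(joint, eps)])
    return noisy, noise


def recover_pose(noisy, noise_estimate, t, alpha_bars):
    """Invert the forward step given a noise estimate.

    In a real system the estimate would come from a learned denoiser;
    here the caller supplies it directly.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [
        [(x - sigma * e) / signal for x, e in zip(joint, eps)]
        for joint, eps in zip(noisy, noise_estimate)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    toy_pose = [[0.0, 0.0, 0.0], [0.1, -0.4, 0.05], [0.2, -0.8, 0.1]]  # hypothetical joints
    noisy, noise = add_noise(toy_pose, 500, alpha_bars, rng)
    print(recover_pose(noisy, noise, 500, alpha_bars))
```

With the exact noise, `recover_pose` returns the original pose up to floating-point error. A trained model only approximates the noise, which is why diffusion-based estimators typically refine the pose over several denoising steps.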