AHRNET: Attention and Heatmap-Based Regressor for Hand Pose Estimation and Mesh Recovery

Feng Zhou, Pei Shen, Ju Dai, Na Jiang, Yong Hu, Yu-Kun Lai, Paul L. Rosin

2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)

Estimating 3D hand pose and recovering the full hand surface mesh from a single RGB image is challenging because of self-occlusion, viewpoint changes, and the complexity of hand articulation. In this paper, we propose a novel framework that combines an attention mechanism with heatmap regression to predict 3D joint locations and reconstruct the hand mesh accurately and efficiently. We adopt a pooling attention module that learns to focus on relevant regions of the input image, extracting features that handle occlusions better while greatly reducing the computational cost. Multi-scale 2D heatmaps provide spatial constraints that guide the 3D vertex predictions. By exploiting the complementary strengths of sparse 2D supervision and dense mesh regression, our method reconstructs hand meshes accurately and with realistic details. Extensive experiments on standard benchmarks demonstrate that the proposed method improves the performance of 3D hand pose estimation and mesh recovery while remaining efficient. Recipes for reproducing the results are available at https://github.com/SDiannn/AHRNET-Heatmap.
Hand Pose, Mesh Recovery, Deep Learning, Human-Computer Interaction, Heatmap
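
The abstract states that multi-scale 2D heatmaps act as spatial constraints, but it does not say how the heatmaps are built or decoded. As a rough illustration only, the sketch below renders a Gaussian heatmap for a single joint at several resolutions and recovers the joint location as the expectation over the normalized heatmap. The function names (`render_heatmap`, `decode_heatmap`), the scales (16, 32, 64), and `sigma` are assumptions made for this example and are not taken from the paper or its repository.

```python
import math

def render_heatmap(uv, size, sigma=1.5):
    """Gaussian heatmap (size x size, as nested lists) for one joint.

    uv is the joint location in normalized [0, 1] image coordinates.
    sigma is in pixels of the rendered map (an assumed value).
    """
    x, y = uv[0] * (size - 1), uv[1] * (size - 1)
    return [
        [math.exp(-((c - x) ** 2 + (r - y) ** 2) / (2.0 * sigma ** 2))
         for c in range(size)]
        for r in range(size)
    ]

def decode_heatmap(heatmap):
    """Expected (u, v) location under the normalized heatmap, in [0, 1]."""
    size = len(heatmap)
    total = sum(sum(row) for row in heatmap)
    # Weighted mean of column indices (x) and row indices (y).
    x = sum(c * v for row in heatmap for c, v in enumerate(row)) / total
    y = sum(r * sum(row) for r, row in enumerate(heatmap)) / total
    return x / (size - 1), y / (size - 1)

# Hypothetical joint and scales, chosen only for illustration.
joint_uv = (0.4, 0.6)
scales = (16, 32, 64)
heatmaps = [render_heatmap(joint_uv, s) for s in scales]
decoded = [decode_heatmap(h) for h in heatmaps]

for s, (u, v) in zip(scales, decoded):
    print(f"scale {s}: decoded joint at ({u:.3f}, {v:.3f})")
```

In a training setup, heatmaps like these would typically serve as 2D supervision targets at each scale. How AHRNET couples them to the 3D vertex regression is described in the paper itself, not in this sketch.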