Point Spatio-Temporal Pyramid Network for Point Cloud Video Understanding

IEEE Signal Processing Letters (2024)

Abstract
Robustness to spatio-temporal sampling is important for point cloud video understanding. Previous works overlook this issue and usually suffer notable performance drops when point densities or frame rates change. To remedy this, we propose a pluggable point spatio-temporal pyramid (PoST-Py) that improves the sampling robustness of point cloud video modeling. Specifically, PoST-Py collects multi-scale feature maps from different layers of the backbone and integrates them into a unified representation, which allows the model to capture multi-scale spatio-temporal information simultaneously. In addition, we employ the temporal cardinality difference to enhance the features so that they capture motion information. Extensive experiments show that PoST-Py achieves state-of-the-art performance, with a notable improvement of over 2% under varying point sampling, which demonstrates the improved robustness of our method. The code is available at https://github.com/JohnsonSign/PoST-Py.
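The idea of collecting feature maps from several backbone layers and merging them into one representation can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_pyramid`, the `(frames, points, channels)` layout, the layer sizes, and the choice of global max pooling followed by concatenation are all assumptions made for illustration. The temporal cardinality difference is not covered here, since the abstract does not define it.

```python
import numpy as np


def fuse_pyramid(features):
    """Pool each (frames, points, channels) map to a vector, then concatenate.

    Hypothetical fusion rule for illustration only; the paper may combine
    the scales differently.
    """
    pooled = []
    for f in features:
        if f.ndim != 3:
            raise ValueError("expected shape (frames, points, channels)")
        # Global max over frames and points leaves one value per channel,
        # so the result does not depend on how many frames or points exist.
        pooled.append(f.max(axis=(0, 1)))
    return np.concatenate(pooled)


# Assumed feature maps from three backbone layers: deeper layers have
# fewer frames and points but more channels.
rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((16, 512, 64)),
    rng.standard_normal((8, 128, 128)),
    rng.standard_normal((4, 32, 256)),
]
fused = fuse_pyramid(layers)
print(fused.shape)  # (448,) = 64 + 128 + 256 channels
```

Because each scale is pooled before concatenation, the fused vector keeps the same length when the number of points or frames per layer changes, which is one simple way a multi-scale summary can tolerate different sampling rates.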
Keywords
Point cloud videos, spatio-temporal pyramid