Multi-View Deep Learning for Consistent Semantic Mapping with RGB-D Cameras

2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017

Abstract
Visual scene understanding is an important capability that enables robots to purposefully act in their environment. In this paper, we propose a novel approach to object-class segmentation from multiple RGB-D views using deep learning. We train a deep neural network to predict object-class semantics that is consistent from several viewpoints in a semi-supervised way. At test time, the semantics predictions of our network can be fused more consistently in semantic keyframe maps than predictions of a network trained on individual views. We base our network architecture on a recent single-view deep learning approach to RGB and depth fusion for semantic object-class segmentation and enhance it with multi-scale loss minimization. We obtain the camera trajectory using RGB-D SLAM and warp the predictions of RGB-D images into ground-truth annotated frames in order to enforce multi-view consistency during training. At test time, predictions from multiple views are fused into keyframes. We propose and analyze several methods for enforcing multi-view consistency during training and testing. We evaluate the benefit of multi-view consistency training and demonstrate that pooling of deep features and fusion over multiple views outperform single-view baselines on the NYUDv2 benchmark for semantic segmentation. Our end-to-end trained network achieves state-of-the-art performance on the NYUDv2 dataset in single-view segmentation as well as multi-view semantic fusion.
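To make the test-time fusion step concrete, the following is a minimal sketch of how per-pixel class probabilities from several views might be combined once they have already been warped into a common keyframe. The abstract does not state which fusion rule is used, so the normalized product of probabilities shown here (which treats views as independent under a uniform class prior) is an assumption chosen for illustration. The names `fuse_pixel` and `fuse_keyframe`, the `None` marker for pixels without a valid warp, and the `eps` floor are likewise hypothetical and not taken from the paper.

```python
import math
from typing import List, Optional, Sequence

Probs = Sequence[float]  # class probabilities predicted for one pixel


def fuse_pixel(views: Sequence[Optional[Probs]],
               eps: float = 1e-8) -> Optional[List[float]]:
    """Fuse one pixel's class probabilities across views.

    Assumed rule (not from the paper): normalized product of the per-view
    probabilities, computed in log space for numerical stability.
    Views with no valid warp at this pixel are passed as None and skipped.
    """
    valid = [p for p in views if p is not None]
    if not valid:
        return None  # no view observed this keyframe pixel
    n_classes = len(valid[0])
    # Sum of log-probabilities per class; eps avoids log(0).
    log_sum = [sum(math.log(max(p[c], eps)) for p in valid)
               for c in range(n_classes)]
    # Subtract the maximum before exponentiating, then normalize.
    m = max(log_sum)
    unnorm = [math.exp(v - m) for v in log_sum]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def fuse_keyframe(
    warped_views: Sequence[Sequence[Optional[Probs]]],
) -> List[Optional[int]]:
    """Return one fused class label per keyframe pixel.

    warped_views[v][i] holds view v's probabilities at keyframe pixel i,
    or None where the warp produced no valid correspondence.
    """
    n_pixels = len(warped_views[0])
    labels: List[Optional[int]] = []
    for i in range(n_pixels):
        probs = fuse_pixel([view[i] for view in warped_views])
        if probs is None:
            labels.append(None)
        else:
            # Index of the most probable class after fusion.
            labels.append(max(range(len(probs)), key=probs.__getitem__))
    return labels


# Toy example: three views, three keyframe pixels, two classes.
example_views = [
    [[0.6, 0.4], [0.3, 0.7], None],
    [[0.4, 0.6], [0.2, 0.8], None],
    [[0.7, 0.3], None, None],
]
example_labels = fuse_keyframe(example_views)
```

In the toy example, the first pixel is seen by all three views, the second by two, and the third by none, so the third fused label stays `None`. Averaging the probabilities instead of multiplying them would be an equally plausible choice under the same interface.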
Keywords
single-view baselines,end-to-end trained network,single-view segmentation,multiview semantic fusion,RGB-D cameras,visual scene understanding,deep neural network approach,RGB-D sequences,multiview consistent semantics,semantics predictions,semantic keyframe maps,network architecture,deep learning approach,depth fusion,semantic object-class segmentation,multiscale loss minimization,RGB-D images,multiview consistency training,semantic mapping,camera trajectory,RGB-D SLAM,Multiview deep learning