RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation
arXiv (2024)
Abstract
We present RiEMann, an end-to-end near Real-time SE(3)-Equivariant Robot
Manipulation imitation learning framework from scene point cloud input.
Compared to previous methods that rely on descriptor field matching, RiEMann
directly predicts the target poses of objects for manipulation without any
object segmentation. RiEMann learns a manipulation task from scratch with 5 to
10 demonstrations, generalizes to unseen SE(3) transformations and instances of
target objects, resists visual interference of distracting objects, and follows
the near real-time pose change of the target object. The scalable action space
of RiEMann facilitates the addition of custom equivariant actions such as the
direction of turning the faucet, which makes articulated object manipulation
possible for RiEMann. In simulation and real-world 6-DOF robot manipulation
experiments, we test RiEMann on 5 categories of manipulation tasks with a total
of 25 variants and show that RiEMann outperforms baselines in both task success
rates and SE(3) geodesic distance errors on predicted poses (reduced by 68.6%),
and achieves a network inference speed of 5.4 frames per second (FPS). Code and
video results are available at https://riemann-web.github.io/.
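
For readers unfamiliar with the pose-error metric mentioned above, the following is a minimal sketch of one common way to measure the error between a predicted and a ground-truth SE(3) pose. It is an assumption for illustration, not necessarily the definition used in the paper: the rotation error is the geodesic angle on SO(3), and the translation error is the Euclidean distance, reported separately. The function names rotation_geodesic, translation_error, and rot_z are hypothetical.

```python
import math

def rotation_geodesic(R1, R2):
    """Geodesic angle in radians between two 3x3 rotation matrices (nested lists)."""
    # trace(R1^T R2) equals the sum of element-wise products of R1 and R2.
    trace = sum(R1[i][j] * R2[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)

def translation_error(t1, t2):
    """Euclidean distance between two 3D translation vectors."""
    return math.dist(t1, t2)

def rot_z(theta):
    """Rotation by theta radians about the z-axis (helper for the example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Example: predicted pose is rotated 90 degrees about z and shifted by (3, 4, 0).
angle_err = rotation_geodesic(rot_z(0.0), rot_z(math.pi / 2))
trans_err = translation_error([0.0, 0.0, 0.0], [3.0, 4.0, 0.0])
```

How the rotational and translational terms are weighted into a single SE(3) distance varies between works, so the sketch keeps them separate.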