THOR: Text to Human-Object Interaction Diffusion via Relation Intervention
CoRR (2024)
Abstract
This paper presents new methods for the challenging task of
generating dynamic Human-Object Interactions from textual descriptions
(Text2HOI). While most existing works assume interactions with limited body
parts or static objects, our task involves addressing the variation in human
motion, the diversity of object shapes, and the semantic vagueness of object
motion simultaneously. To tackle this, we propose a novel Text-guided
Human-Object Interaction diffusion model with Relation Intervention (THOR).
THOR is a cohesive diffusion model equipped with a relation intervention
mechanism. In each diffusion step, we initiate text-guided human and object
motion and then leverage human-object relations to intervene in object motion.
This intervention enhances the spatial-temporal relations between humans and
objects, with human-centric interaction representation providing additional
guidance for synthesizing consistent motion from text. To achieve more
reasonable and realistic results, interaction losses are introduced at different
levels of motion granularity. Moreover, we construct Text-BEHAVE, a Text2HOI
dataset that seamlessly integrates textual descriptions with the currently
largest publicly available 3D HOI dataset. Both quantitative and qualitative
experiments demonstrate the effectiveness of our proposed model.
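
The two-stage structure of each diffusion step described above (initial text-guided human and object estimates, followed by an intervention on the object motion) can be illustrated with a toy sketch. The abstract does not specify how the relations are computed or applied, so everything here is an assumption made for illustration: the function names, the use of a per-frame human-to-object offset as the "relation", and the linear blending rule are hypothetical and are not the paper's mechanism.

```python
import numpy as np


def intervene_object_motion(object_pos, human_root, relation_offset, weight=0.5):
    """Pull a predicted object trajectory toward the one implied by the human.

    All array arguments have shape (frames, 3). `relation_offset` is a
    hypothetical human-centric relation: the predicted object position
    relative to the human root in each frame. `weight` in [0, 1] sets the
    strength of the intervention (0 = no change, 1 = fully relation-driven).
    """
    implied_object_pos = human_root + relation_offset
    return (1.0 - weight) * object_pos + weight * implied_object_pos


def toy_denoising_step(human_root, object_pos, relation_offset, weight=0.5):
    """One toy step: take initial estimates, then intervene on the object only."""
    # Stage 1 (placeholder): a real model would produce these text-guided
    # estimates with a denoising network; here they are passed in directly.
    # Stage 2: the human estimate is kept, the object estimate is corrected.
    refined_object = intervene_object_motion(
        object_pos, human_root, relation_offset, weight
    )
    return human_root, refined_object


frames = 4
human = np.zeros((frames, 3))          # toy human root trajectory
obj = np.ones((frames, 3))             # toy initial object trajectory
offset = np.full((frames, 3), 2.0)     # toy human-to-object relation
_, refined = toy_denoising_step(human, obj, offset, weight=0.5)
```

In this sketch only the object trajectory is modified, mirroring the abstract's statement that relations intervene in object motion; a learned model would replace both the placeholder estimates and the fixed blending weight.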