DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition
CoRR (2024)
Abstract
In this paper, we address the problem of plausible object placement for the
challenging task of realistic image composition. We propose DiffPop, the first
framework that utilizes a plausibility-guided denoising diffusion probabilistic
model to learn the scale and spatial relations among multiple objects and the
corresponding scene image. First, we train an unguided diffusion model to
directly learn the object placement parameters in a self-supervised manner.
Then, we develop a human-in-the-loop pipeline that uses human labels on
the diffusion-generated composite images to provide weak supervision for
training a structural plausibility classifier. The classifier is then used
to guide the diffusion sampling process toward plausible object
placements. Experimental results verify the superiority of our method for
producing plausible and diverse composite images on the new Cityscapes-OP
dataset and the public OPA dataset, and demonstrate its potential in
applications such as data augmentation and multi-object placement tasks. Our
dataset and code will be released.
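
The guided sampling step described in the abstract can be illustrated with a minimal sketch of classifier-guided DDPM sampling over a placement vector. This is not the authors' implementation: the three-dimensional placement (x, y, scale), the noise schedule, the analytic noise predictor `toy_eps_model` (exact only if placements were distributed as a standard normal) and the Gaussian-shaped `toy_plausibility_grad` are all hypothetical stand-ins for the trained diffusion model and the learned plausibility classifier.

```python
import math
import random


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and cumulative products of (1 - beta)."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for b in betas:
        running *= 1.0 - b
        alpha_bars.append(running)
    return betas, alpha_bars


def toy_eps_model(x_t, alpha_bar_t):
    # Hypothetical stand-in for the trained denoiser: the exact noise
    # predictor when clean placements follow N(0, I).
    return [math.sqrt(1.0 - alpha_bar_t) * v for v in x_t]


def toy_plausibility_grad(x_t, target, sigma=0.5):
    # Hypothetical stand-in for the classifier: gradient of
    # log p(plausible | x_t) for a Gaussian bump centred on `target`.
    return [-(v - m) / sigma ** 2 for v, m in zip(x_t, target)]


def guided_sample(target, guidance_scale=1.0, num_steps=50, rng=None):
    """Draw one placement vector with classifier-guided ancestral sampling."""
    rng = rng or random.Random(0)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0, 1) for _ in target]  # start from Gaussian noise
    for t in reversed(range(num_steps)):
        beta = betas[t]
        eps = toy_eps_model(x, alpha_bars[t])
        grad = toy_plausibility_grad(x, target)
        coef = beta / math.sqrt(1.0 - alpha_bars[t])
        new_x = []
        for v, e, g in zip(x, eps, grad):
            # Unguided DDPM posterior mean for this coordinate.
            mean = (v - coef * e) / math.sqrt(1.0 - beta)
            # Classifier guidance: shift the mean by scale * variance * gradient.
            mean += guidance_scale * beta * g
            noise = rng.gauss(0, 1) if t > 0 else 0.0
            new_x.append(mean + math.sqrt(beta) * noise)
        x = new_x
    return x


if __name__ == "__main__":
    target = [0.5, 0.5, 0.3]  # assumed "plausible" (x, y, scale), normalized
    print("unguided:", guided_sample(target, guidance_scale=0.0))
    print("guided:  ", guided_sample(target, guidance_scale=1.0))
```

Setting `guidance_scale=0.0` recovers unguided sampling, while larger values pull samples more strongly toward the region the stand-in classifier scores as plausible, at the cost of diversity.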