InstanceDiffusion: Instance-level Control for Image Generation
CoRR(2024)
摘要
Text-to-image diffusion models produce high quality images but do not offer
control over individual instances in the image. We introduce InstanceDiffusion
that adds precise instance-level control to text-to-image diffusion models.
InstanceDiffusion supports free-form language conditions per instance and
allows flexible ways to specify instance locations such as simple single
points, scribbles, bounding boxes or intricate instance segmentation masks, and
combinations thereof. We propose three major changes to text-to-image models
that enable precise instance-level control. Our UniFusion block enables
instance-level conditions for text-to-image models, the ScaleU block improves
image fidelity, and our Multi-instance Sampler improves generations for
multiple instances. InstanceDiffusion significantly surpasses specialized
state-of-the-art models for each location condition. Notably, on the COCO
dataset, we outperform previous state-of-the-art by 20.4
for box inputs, and 25.4
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要