Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization
CoRR (2024)
Abstract
Diffusion-based text-to-image personalization has achieved great success in
generating user-specified subjects in various contexts. However, existing
finetuning-based methods still suffer from model overfitting, which
greatly harms generative diversity, especially when the given subject images
are few. To this end, we propose Pick-and-Draw, a training-free semantic
guidance approach to boost identity consistency and generative diversity for
personalization methods. Our approach consists of two components: appearance
picking guidance and layout drawing guidance. As for the former, we construct
an appearance palette with visual features from the reference image, where we
pick local patterns for generating the specified subject with consistent
identity. As for layout drawing, we outline the subject's contour by referring
to a generative template from the vanilla diffusion model, and inherit the
strong image prior to synthesize diverse contexts according to different text
conditions. The proposed approach can be applied to any personalized diffusion
model and requires as few as a single reference image. Qualitative and
quantitative experiments show that Pick-and-Draw consistently improves identity
consistency and generative diversity, pushing the trade-off between subject
fidelity and image-text fidelity to a new Pareto frontier.
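The abstract describes the two guidance terms only at a high level. The sketch below is a minimal, hypothetical illustration of how training-free appearance-picking and layout-drawing guidance could be wired up in PyTorch; the function names, feature shapes, nearest-neighbor matching, and loss forms are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): (1) build an "appearance palette"
# of reference features and pick nearest matches for generated foreground
# features; (2) constrain the subject layout with a template mask obtained from
# the vanilla (non-personalized) diffusion model.
import torch
import torch.nn.functional as F


def build_appearance_palette(ref_feats: torch.Tensor, ref_mask: torch.Tensor) -> torch.Tensor:
    """Collect foreground feature vectors from the reference image.
    ref_feats: (C, H, W) intermediate diffusion features of the reference.
    ref_mask:  (H, W) binary subject mask."""
    c = ref_feats.shape[0]
    palette = ref_feats.reshape(c, -1)[:, ref_mask.reshape(-1) > 0]  # (C, N_fg)
    return F.normalize(palette, dim=0)


def appearance_picking_loss(gen_feats: torch.Tensor, palette: torch.Tensor,
                            gen_mask: torch.Tensor) -> torch.Tensor:
    """Encourage each foreground feature of the generated image to match its
    nearest palette entry (cosine similarity)."""
    c = gen_feats.shape[0]
    fg = F.normalize(gen_feats.reshape(c, -1)[:, gen_mask.reshape(-1) > 0], dim=0)  # (C, M)
    sim = palette.t() @ fg                      # (N_fg, M) cosine similarities
    return (1.0 - sim.max(dim=0).values).mean()


def layout_drawing_loss(subject_attn: torch.Tensor, template_mask: torch.Tensor) -> torch.Tensor:
    """Keep the subject's attention map close to the contour template drawn
    from a vanilla-model generation (same spatial size assumed)."""
    return F.mse_loss(subject_attn, template_mask.float())


# During sampling, one would combine the two losses and take a gradient step on
# the current latent before the next denoising iteration, e.g.:
#   loss = w_app * appearance_picking_loss(...) + w_lay * layout_drawing_loss(...)
#   latents = latents - scale * torch.autograd.grad(loss, latents)[0]
```

In this reading, the guidance acts purely at inference time on intermediate features and latents, which is what allows it to plug into any personalized diffusion model without further training.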