Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning
CoRR (2024)
Abstract
Drawing upon StyleGAN's expressivity and disentangled latent space, existing
2D approaches employ textual prompting to edit facial images with different
attributes. In contrast, 3D-aware approaches that generate faces at different
target poses require attribute-specific classifiers, learn separate model
weights for each attribute, and do not scale to novel attributes. In this
work, we propose an efficient, plug-and-play, 3D-aware face editing framework
based on attribute-specific prompt learning, enabling the generation of facial
images with controllable attributes across various target poses. To this end,
we introduce a text-driven learnable style token-based latent attribute editor
(LAE). The LAE harnesses a pre-trained vision-language model to find a
text-guided, attribute-specific editing direction in the latent space of any
pre-trained 3D-aware GAN. It uses learnable style tokens and style mappers
to learn this editing direction and map it into the 3D latent space. To train
the LAE on multiple attributes, we use a directional contrastive loss and a
style token loss. Furthermore, to ensure view consistency and identity
preservation across
different poses and attributes, we employ several 3D-aware identity and pose
preservation losses. Our experiments show that our proposed framework generates
high-quality images with 3D awareness and view consistency while maintaining
attribute-specific features. We demonstrate the effectiveness of our method on
different facial attributes, including hair color and style, expression, and
others. Code:
https://github.com/VIROBO-15/Efficient-3D-Aware-Facial-Image-Editing.
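
The idea of a text-guided editing direction can be illustrated with a generic directional alignment loss of the kind commonly used in vision-language-guided editing. The sketch below is not the paper's directional contrastive loss, whose exact form the abstract does not give. It only assumes that image and text embeddings live in a shared space and that an edit should move the image embedding in the same direction as the text embedding moves from a source prompt to a target prompt. The function names and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def directional_loss(img_src, img_edit, txt_src, txt_tgt):
    """Assumed generic form: 1 - cos(delta_image, delta_text).

    img_src / img_edit: embeddings of the source and edited image.
    txt_src / txt_tgt:  embeddings of a neutral and an attribute prompt.
    The loss is 0 when both deltas point the same way and 2 when opposite.
    """
    delta_img = [e - s for s, e in zip(img_src, img_edit)]
    delta_txt = [t - s for s, t in zip(txt_src, txt_tgt)]
    return 1.0 - cosine_similarity(delta_img, delta_txt)


# Toy 2-D embeddings standing in for real vision-language features.
aligned = directional_loss([0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [2.0, 0.0])
orthogonal = directional_loss([0.0, 0.0], [0.0, 1.0], [0.0, 0.0], [2.0, 0.0])
opposite = directional_loss([0.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [2.0, 0.0])
```

In a real pipeline the embeddings would come from the pre-trained vision-language model and the loss would be minimized with respect to the learnable parameters; the linked repository is the authoritative reference for the actual training objective.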