Stylus: Automatic Adapter Selection for Diffusion Models
arxiv(2024)
摘要
Beyond scaling base models with more data or parameters, fine-tuned adapters
provide an alternative way to generate high fidelity, custom images at reduced
costs. As such, adapters have been widely adopted by open-source communities,
accumulating a database of over 100K adapters-most of which are highly
customized with insufficient descriptions. This paper explores the problem of
matching the prompt to a set of relevant adapters, built on recent work that
highlight the performance gains of composing adapters. We introduce Stylus,
which efficiently selects and automatically composes task-specific adapters
based on a prompt's keywords. Stylus outlines a three-stage approach that first
summarizes adapters with improved descriptions and embeddings, retrieves
relevant adapters, and then further assembles adapters based on prompts'
keywords by checking how well they fit the prompt. To evaluate Stylus, we
developed StylusDocs, a curated dataset featuring 75K adapters with
pre-computed adapter embeddings. In our evaluation on popular Stable Diffusion
checkpoints, Stylus achieves greater CLIP-FID Pareto efficiency and is twice as
preferred, with humans and multimodal models as evaluators, over the base
model. See stylus-diffusion.github.io for more.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要