Bringing Textual Prompt to AI-Generated Image Quality Assessment

Bowen Qu, Haohui Li, Wei Gao

2024 IEEE International Conference on Multimedia and Expo (ICME) (2024)

Abstract
AI-Generated Images (AGIs) are inherently multimodal. Unlike traditional image quality assessment (IQA) on natural scenarios, AGI quality assessment (AGIQA) takes the correspondence between an image and its textual prompt into consideration. This correspondence is coupled into the ground-truth score, which confuses unimodal IQA methods. To solve this problem, we introduce IP-IQA (AGIs Quality Assessment via Image and Prompt), a multimodal framework for AGIQA that incorporates both the image and its corresponding prompt. Specifically, we propose a novel incremental pretraining task named Image2Prompt for better understanding of AGIs and their corresponding textual prompts. An effective and efficient image-prompt fusion module, along with a novel special [QA] token, is also applied. Both are plug-and-play and beneficial for the cooperation of the image and its corresponding prompt. Experiments demonstrate that our IP-IQA achieves state-of-the-art performance on the AGIQA-1k and AGIQA-3k datasets. Code will be available.