
Agent Attention: On the Integration of Softmax and Linear Attention

Computing Research Repository (CoRR) (2024)

Tsinghua University | DAMO Academy | Kuaishou Technology

Cited 111 | Views 150
Abstract
The attention module is the key component in Transformers. While the global attention mechanism offers high expressiveness, its excessive computational cost restricts its applicability in various scenarios. In this paper, we propose a novel attention paradigm, Agent Attention, to strike a favorable balance between computational efficiency and representation power. Specifically, the Agent Attention, denoted as a quadruple (Q, A, K, V), introduces an additional set of agent tokens A into the conventional attention module. The agent tokens first act as the agent for the query tokens Q to aggregate information from K and V, and then broadcast the information back to Q. Given that the number of agent tokens can be designed to be much smaller than the number of query tokens, the agent attention is significantly more efficient than the widely adopted Softmax attention, while preserving global context modelling capability. Interestingly, we show that the proposed agent attention is equivalent to a generalized form of linear attention. Therefore, agent attention seamlessly integrates the powerful Softmax attention and the highly efficient linear attention. Extensive experiments demonstrate the effectiveness of agent attention with various vision Transformers and across diverse vision tasks, including image classification, object detection, semantic segmentation and image generation. Notably, agent attention has shown remarkable performance in high-resolution scenarios, owing to its linear attention nature. For instance, when applied to Stable Diffusion, our agent attention accelerates generation and substantially enhances image generation quality without any additional training. Code is available at https://github.com/LeapLabTHU/Agent-Attention.
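To illustrate the two-stage computation the abstract describes, here is a minimal single-head sketch in NumPy. It assumes the agent tokens A are already given (the paper obtains them from the queries, e.g. by pooling), and it omits the additional components of the released implementation such as positional/agent biases; the function and variable names are ours, not the authors'.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def agent_attention(Q, K, V, A):
    """Two-stage agent attention for a single head, without biases.
    Q: (N, d), K: (N, d), V: (N, d_v), A: (n, d) agent tokens with n << N."""
    scale = 1.0 / np.sqrt(Q.shape[-1])
    # Stage 1 (agent aggregation): the agents act as queries over K and V.
    agent_out = softmax((A @ K.T) * scale, axis=-1) @ V       # (n, d_v)
    # Stage 2 (agent broadcast): the original queries attend to the agents.
    return softmax((Q @ A.T) * scale, axis=-1) @ agent_out    # (N, d_v)

# Quick shape check with random tensors.
N, n, d = 1024, 49, 64
rng = np.random.default_rng(0)
Q, K, V = rng.standard_normal((3, N, d))
A = rng.standard_normal((n, d))
print(agent_attention(Q, K, V, A).shape)   # (1024, 64)
```

Because both stages are ordinary softmax attentions over only n agent tokens, the composite map can also be read as a generalized linear attention, which is the integration of Softmax and linear attention referred to in the title.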
Keywords
Attention mechanism, Agent attention, Vision Transformer

Chat Paper

Key points: This paper proposes Agent Attention, a novel attention mechanism that introduces a set of agent tokens far fewer in number than the query tokens, balancing computational efficiency and representational power while retaining global context modeling, and shows it to be a generalized form of linear attention.

Method: Agent Attention introduces agent tokens A into the conventional (Q, K, V) attention module; the agent tokens first aggregate information from K and V on behalf of the query tokens Q, and then broadcast that information back to Q, reducing computational complexity while preserving global context modeling, as quantified in the rough cost count below.
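To make the efficiency claim concrete, here is a back-of-the-envelope matrix-multiply FLOP count for a single head; the token counts are illustrative assumptions, not figures taken from the paper.

```python
# Rough matmul FLOPs for one attention head; numbers are illustrative only.
N, n, d = 4096, 49, 64                              # sequence length, agent tokens, head dim
softmax_flops = 2 * N * N * d + 2 * N * N * d       # Q @ K^T, then attention @ V
agent_flops = (2 * n * N * d + 2 * n * N * d        # stage 1: A @ K^T, then @ V
               + 2 * N * n * d + 2 * N * n * d)     # stage 2: Q @ A^T, then @ agent values
print(f"softmax ~ {softmax_flops:.2e} FLOPs, agent ~ {agent_flops:.2e} FLOPs")
# At fixed n, the agent cost grows linearly with N; the softmax cost grows quadratically.
```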

Experiments: The method is validated with various vision Transformer models across multiple vision tasks, including image classification, object detection, semantic segmentation, and image generation. In high-resolution scenarios in particular, Agent Attention delivers notable gains owing to its linear-attention nature. Several datasets are used, though this summary does not enumerate them. Applied to Stable Diffusion, Agent Attention accelerates image generation and improves generation quality without any additional training. Code is open-sourced at https://github.com/LeapLabTHU/Agent-Attention.