Effective Single-Step Adversarial Training With Energy-Based Models

IEEE Transactions on Emerging Topics in Computational Intelligence (2024)

Abstract
Adversarial training (AT) is one of the most effective defenses against adversarial attacks. However, multi-step AT is time-consuming, while single-step AT is ineffective. In this paper, we propose an Energy-AT framework that makes single-step AT as effective as multi-step AT by exploiting two properties of energy-based models (EBMs). First, we use the Helmholtz free energy of the EBM to push generated examples outside the distribution boundaries of their categories, making them more adversarial. Second, we apply an adaptive temperature scheme in the EBM to selectively amplify the training gradients of weak adversarial examples, so that these originally hard-to-learn examples also contribute to the robustness of the model. Extensive experiments show that Energy-AT significantly improves model robustness against adversarial attacks in both white-box and black-box settings and outperforms state-of-the-art methods.
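
The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes the common convention of treating a classifier's logits f_y(x) as negative energies, so that the Helmholtz free energy is F(x) = -T * log(sum_y exp(f_y(x) / T)). The function names `free_energy` and `ce_grad_wrt_logits` are hypothetical, and the fixed temperatures below only show how temperature rescales gradients; this is not the adaptive scheme or the attack objective of Energy-AT, whose details the abstract does not give.

```python
import math


def free_energy(logits, temperature=1.0):
    """Helmholtz free energy F(x) = -T * log(sum_y exp(f_y(x) / T)).

    Assumption: logits are interpreted as negative energies.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def ce_grad_wrt_logits(logits, label, temperature=1.0):
    """Gradient of cross-entropy on softmax(logits / T) with respect to the logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # d(loss)/d(z_i) = (p_i - 1[i == label]) / T
    return [(p - (1.0 if i == label else 0.0)) / temperature
            for i, p in enumerate(probs)]


# An example the model still classifies confidently: its gradient is tiny at T = 1.
logits = [5.0, 0.0]
grad_t1 = ce_grad_wrt_logits(logits, label=0, temperature=1.0)
grad_t5 = ce_grad_wrt_logits(logits, label=0, temperature=5.0)
print(free_energy(logits), grad_t1[0], grad_t5[0])
```

In this toy case, raising the temperature flattens the softmax and enlarges the gradient of the confidently classified example, which is one plausible reading of why a per-example temperature could help weak adversarial examples contribute more to training.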
Keywords
Adversarial training, adversarial attacks, energy-based models