Intrinsic Biologically Plausible Adversarial Training
CoRR (2023)
Abstract
Artificial Neural Networks (ANNs) trained with Backpropagation (BP) excel at a wide range of everyday tasks but have a dangerous vulnerability: inputs with small targeted perturbations, also known as adversarial samples, can drastically disrupt their performance. Adversarial training, a technique in which the training dataset is augmented with exemplary adversarial samples, has been shown to mitigate this problem but comes at a high computational cost. In contrast to ANNs, humans do not misclassify these same adversarial samples, so one can postulate that biologically-plausible trained ANNs might be more robust against adversarial attacks. Choosing as a case study the biologically-plausible learning algorithm Present the Error to Perturb the Input To modulate Activity (PEPITA), we investigate this question through a comparative analysis with BP-trained ANNs on various computer vision tasks. We observe that PEPITA has a higher intrinsic adversarial robustness and, when adversarially trained, exhibits a more favorable natural-vs-adversarial performance trade-off: for the same natural accuracies, PEPITA's adversarial accuracies decrease on average by only 0.26%.
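The adversarial training the abstract describes can be made concrete with a minimal sketch. The snippet below is not the paper's experimental setup: it trains a plain NumPy logistic-regression classifier and augments each batch with FGSM-style adversarial samples (the classic sign-of-the-gradient perturbation). The function names, the epsilon value, and the toy data are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_wrt_input(x, y, w, b):
    # Gradient of the logistic loss w.r.t. the input x:
    # dL/dx = (sigmoid(w.x + b) - y) * w
    p = sigmoid(x @ w + b)
    return (p - y)[:, None] * w[None, :]

def fgsm_perturb(x, y, w, b, epsilon=0.1):
    # FGSM: move each input one epsilon-step along the sign of the loss gradient.
    return x + epsilon * np.sign(grad_wrt_input(x, y, w, b))

# Toy data: two Gaussian blobs standing in for a vision dataset.
x = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(1, 1, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(200):
    # Adversarial training: augment the clean batch with adversarial samples.
    x_adv = fgsm_perturb(x, y, w, b)
    x_all, y_all = np.vstack([x, x_adv]), np.concatenate([y, y])
    p = sigmoid(x_all @ w + b)
    w -= lr * (x_all.T @ (p - y_all)) / len(y_all)
    b -= lr * np.mean(p - y_all)

print("clean accuracy:", np.mean((sigmoid(x @ w + b) > 0.5) == y))
```

Even in this toy setting, the trade-off the abstract refers to appears: larger epsilon values buy adversarial robustness at the cost of some natural (clean) accuracy.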
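PEPITA itself dispenses with the backward pass: the output error is projected back onto the input through a fixed random matrix F, and the weights are updated from the difference between a standard forward pass and a second, error-modulated forward pass. Below is a minimal one-hidden-layer NumPy sketch of that learning rule as introduced by Dellaferrera & Kreiman (2022); the layer sizes, learning rate, and the scaling of F here are assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

n_in, n_hid, n_out, lr = 784, 128, 10, 0.01
W1 = rng.normal(0.0, np.sqrt(2.0 / n_in), (n_hid, n_in))
W2 = rng.normal(0.0, np.sqrt(2.0 / n_hid), (n_out, n_hid))
# Fixed random projection of the output error back onto the input
# (the scale is an assumption for this sketch).
F = rng.uniform(-1, 1, (n_in, n_out)) * np.sqrt(6.0 / n_in)

def pepita_step(x, y_onehot):
    global W1, W2
    # First (standard) forward pass.
    h = relu(W1 @ x)
    y_hat = softmax(W2 @ h)
    e = y_hat - y_onehot            # output error

    # Second forward pass on the error-modulated input.
    x_mod = x + F @ e
    h_mod = relu(W1 @ x_mod)

    # PEPITA updates: no backward pass, only quantities
    # available from the two forward passes.
    W1 -= lr * np.outer(h - h_mod, x_mod)
    W2 -= lr * np.outer(e, h_mod)

# One illustrative update on random data standing in for an image/label pair.
x = rng.random(n_in)
y = np.eye(n_out)[3]
pepita_step(x, y)
```

Because credit assignment uses only forward passes and a fixed projection, the rule avoids BP's weight transport, which is what makes PEPITA a biologically-plausible candidate for the robustness comparison above.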
Keywords
training