Are Aligned Neural Networks Adversarially Aligned?
Nicholas Carlini, Milad Nasr, Christopher A. Choquette-Choo, Matthew Jagielski, Irena Gao, Pang Wei Koh, Daphne Ippolito, Florian Tramèr, Ludwig Schmidt
arXiv (Cornell University) (2023)
Cited by: 217
Keywords: Adversarial examples, large language models, alignment