Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing.

CoRR (2024)

Abstract
Adapting pre-trained deep learning models to customized tasks has become a popular choice for developers coping with limited computational resources and data volume. More specifically, probing--training a downstream head on a pre-trained encoder--has been widely adopted in transfer learning, as it helps to prevent overfitting and catastrophic forgetting. However, the generalizability of pre-trained encoders raises concerns about their potential misuse through probing for harmful purposes, such as discriminatory speculation and warfare applications. In this work, we introduce EncoderLock, a novel applicability-authorization method designed to protect pre-trained encoders from malicious probing, i.e., yielding poor performance on specified prohibited domains while maintaining their utility in authorized ones. Achieving this balance is challenging because of the opposing optimization objectives and the variety of downstream heads that adversaries can apply adaptively. To address these challenges, EncoderLock employs two techniques: domain-aware weight selection and updating, which restricts the encoder's applicability to prohibited domains/tasks, and a self-challenging training scheme, which iteratively strengthens resistance against any downstream classifier an adversary might train. Moreover, recognizing that data from prohibited domains may be unavailable in practice, we introduce three EncoderLock variants with different levels of data accessibility: supervised (prohibited-domain data with labels), unsupervised (prohibited-domain data without labels), and zero-shot (no data or labels available). We verify EncoderLock's effectiveness and practicality with a real-world pre-trained Vision Transformer (ViT) encoder from Facebook. These results underscore the valuable contributions EncoderLock makes to the development of responsible AI.
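The two mechanisms named in the abstract can be made concrete with a short sketch. Below is a minimal PyTorch illustration reconstructed from the abstract alone: it scores weights by the gap between their gradient magnitudes on the prohibited and authorized losses (an assumed selection criterion), trains a fresh linear probe as the self-challenging adversary, and applies a sparse, masked encoder update. The toy encoder, the anchor loss, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of EncoderLock-style training, based only on the
# abstract: (1) domain-aware weight selection/updating, (2) self-challenging
# training against a freshly fitted probe.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for a pre-trained encoder (a ViT in the paper).
encoder = nn.Sequential(
    nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Linear(256, 128)
)
reference = copy.deepcopy(encoder).eval()       # frozen copy of the original
for p in reference.parameters():
    p.requires_grad_(False)


def domain_aware_masks(loss_prohibited, loss_authorized, keep_ratio=0.01):
    """Select which weights to update: score each parameter by how much
    larger its gradient is on the prohibited loss than on the authorized
    loss (an assumed importance criterion), keeping only the top fraction."""
    params = list(encoder.parameters())
    g_p = torch.autograd.grad(loss_prohibited, params, retain_graph=True)
    g_a = torch.autograd.grad(loss_authorized, params, retain_graph=True)
    scores = [gp.abs() - ga.abs() for gp, ga in zip(g_p, g_a)]
    flat = torch.cat([s.flatten() for s in scores])
    k = max(1, int(keep_ratio * flat.numel()))
    threshold = flat.topk(k).values.min()
    return [(s >= threshold).float() for s in scores]


def lock_step(x_prohib, y_prohib, x_auth, n_classes=10, probe_steps=50, lam=1.0):
    # Self-challenging phase: fit a fresh linear probe on the prohibited
    # domain, emulating the strongest head an adaptive adversary could train.
    probe = nn.Linear(128, n_classes)
    opt_probe = torch.optim.Adam(probe.parameters(), lr=1e-2)
    feats = encoder(x_prohib).detach()
    for _ in range(probe_steps):
        opt_probe.zero_grad()
        F.cross_entropy(probe(feats), y_prohib).backward()
        opt_probe.step()

    # Locking phase: update only the domain-aware subset of weights so the
    # probe fails on the prohibited domain while authorized-domain features
    # stay close to the original encoder's (a simple anchor loss).
    opt_enc = torch.optim.SGD(encoder.parameters(), lr=1e-3)
    opt_enc.zero_grad()
    loss_prohibited = -F.cross_entropy(probe(encoder(x_prohib)), y_prohib)
    loss_authorized = F.mse_loss(encoder(x_auth), reference(x_auth))
    masks = domain_aware_masks(loss_prohibited, loss_authorized)
    (loss_prohibited + lam * loss_authorized).backward()
    for p, m in zip(encoder.parameters(), masks):   # sparse, targeted update
        p.grad.mul_(m)
    opt_enc.step()


# Usage with random stand-in data for the two domains.
x_p, y_p = torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))
x_a = torch.randn(64, 3, 32, 32)
for _ in range(5):   # iterate: each round challenges the encoder anew
    lock_step(x_p, y_p, x_a)
```

Each call to lock_step re-trains the probe from scratch, mirroring the iterative "challenge" an adaptive adversary poses; in practice the probe architecture and the data would match the real prohibited and authorized domains.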

Highlights: This paper proposes EncoderLock, a method for protecting pre-trained encoders from malicious probing, keeping them useful in authorized domains while making them perform poorly in prohibited ones.

Method: EncoderLock applies domain-aware weight selection and updating to restrict use on prohibited domains/tasks, together with a self-challenging training scheme that strengthens resistance to potential downstream classifiers.

Experiments: Experiments with a pre-trained Vision Transformer (ViT) encoder from Facebook verify EncoderLock's effectiveness and practicality, covering variants with different levels of data accessibility (supervised, unsupervised, and zero-shot); see the sketch below for how the prohibited-domain objective might differ across these variants.
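The three variants differ only in what prohibited-domain signal is available when computing the prohibited loss. The sketch below, reusing the setup from the earlier code, shows one way the objective could change per variant; only the supervised case follows directly from the abstract, while the unsupervised and zero-shot objectives are illustrative guesses.

```python
# Illustrative prohibited-domain objectives for the three variants; only the
# supervised case follows directly from the abstract, the rest are guesses.
import torch
import torch.nn.functional as F


def prohibited_loss(feats, labels=None, probe=None):
    if labels is not None and probe is not None:
        # Supervised: labeled prohibited data, so directly maximize the
        # self-challenging probe's error (gradient ascent via negation).
        return -F.cross_entropy(probe(feats), labels)
    # Unsupervised: unlabeled prohibited data only. One option is to collapse
    # the feature variance on this domain, so that no downstream head can
    # separate classes regardless of how it is trained.
    return feats.var(dim=0).mean()


# Zero-shot: no prohibited data at all. One plausible fallback is to apply
# the unsupervised objective to synthesized surrogate inputs, e.g.:
#   surrogate = torch.randn(64, 3, 32, 32)
#   loss = prohibited_loss(encoder(surrogate))
```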