AQ2PNN: Enabling Two-party Privacy-Preserving Deep Neural Network Inference with Adaptive Quantization

56TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, MICRO 2023(2023)

引用 0|浏览6
暂无评分
摘要
The growing prevalence of Machine Learning as a Service (MLaaS) enables a wide range of applications but simultaneously raises numerous security and privacy concerns. A key issue involves the potential privacy exposure of involved parties, such as the customer's input data and the vendor's model. Consequently, two-party computing (2PC) has emerged as a promising solution to safeguard the privacy of different parties during deep neural network (DNN) inference. However, the state-of-the-art (SOTA) 2PC-DNN techniques are tailored explicitly to traditional instruction set architecture (ISA) systems like CPUs and CPU+GPU. This reliance on ISA systems significantly constrains their energy efficiency, as these architectures typically employ 32- or 64-bit instruction sets. In contrast, the possibilities of harnessing dynamic and adaptive quantization to build high-performance 2PC-DNNs remain largely unexplored due to the lack of compatible algorithms and hardware accelerators. To mitigate the bottleneck of SOTA solutions and fill the existing research gaps, this work investigates the construction of 2PC-DNNs on field programmable gate arrays (FPGAs). We introduce AQ2PNN, an end-to-end framework that effectively employs adaptive quantization schemes to develop high-performance 2PC-DNNs on FPGAs. From an algorithmic perspective, AQ2PNN introduces an innovative 2PC-ReLU method to replace Yao's Garbled Circuits (GC). Regarding hardware, AQ2PNN employs an extensive set of building blocks for linear operators, non-linear operators, and a specialized Oblivious Transfer (OT) module for secure data exchange, respectively. These algorithm-hardware co-designed modules extremely utilize the fine-grained reconfigurability of FPGAs, to adapt the data bit-width of different DNN layers in the ciphertext domain, thereby reducing communication overhead between parties without compromising DNN performance, such as accuracy. We thoroughly assess AQ2PNN using widely adopted DNN architectures, including ResNet18, ResNet50, and VGG16, all trained on ImageNet and producing quantized models. Experimental results demonstrate that AQ2PNN outperforms SOTA solutions, achieving significantly reduced communication overhead by 25%, improved energy efficiency by 26.3x, and comparable or even superior throughput and accuracy.
更多
查看译文
关键词
Privacy-Preserving machine learning,Deep learning,FPGA,Quantization,Two-party computing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要