1. Enabling Homomorphically Encrypted Inference for Large DNN Models
- Author
Shigeki Tomishima, Marc Jordà, Antonio J. Peña, Harald Servat, Fabian Boemer, Chetan Chauhan, Guillermo Lloret-Talavera, and Nilesh N. Shah
- Affiliation
Barcelona Supercomputing Center
- Subjects
Computer Science - Machine Learning (cs.LG); Computer Science - Cryptography and Security (cs.CR); Computer Science - Performance (cs.PF); FOS: Computer and information sciences; Information privacy; Privacy-Preserving Machine Learning; Homomorphic encryption; Encryption of data (Computer science); Deep learning; Artificial neural networks; Artificial intelligence; Inference; Distributed computing; Memory management (Computer science); Random access memory; Central processing unit; Hardware and Architecture; Computational Theory and Mathematics; Theoretical Computer Science; Software engineering [UPC thematic areas: Computer science::Software engineering]; Software
- Abstract
The proliferation of machine learning services in the last few years has raised data privacy concerns. Homomorphic encryption (HE) enables inference on encrypted data, but it incurs 100x-10,000x memory and runtime overheads. Secure deep neural network (DNN) inference using HE is currently limited by computing and memory resources, with frameworks requiring hundreds of gigabytes of DRAM to evaluate small models. To overcome these limitations, in this paper we explore the feasibility of leveraging hybrid memory systems comprised of DRAM and persistent memory. In particular, we use the recently released Intel Optane PMem technology and the Intel HE-Transformer for nGraph to run, for the first time in the literature, large neural networks such as MobileNetV2 (in its largest variant) and ResNet-50. We present an in-depth analysis of the efficiency of these executions under different hardware and software configurations. Our results show that DNN inference using HE exhibits memory access patterns that are friendly to this memory configuration, yielding efficient executions.

Manuscript accepted for publication in IEEE Transactions on Computers.
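The abstract's central premise, that arithmetic can be carried out directly on ciphertexts, can be illustrated with a toy additively homomorphic scheme. The sketch below uses Paillier encryption with deliberately tiny, insecure parameters; it is an illustration of the homomorphic property only, not the CKKS-based scheme that HE-Transformer for nGraph actually uses for DNN inference:

```python
# Toy Paillier cryptosystem: multiplying two ciphertexts yields an
# encryption of the SUM of their plaintexts, so a server can compute
# on data it cannot read. Tiny primes for illustration only; real HE
# schemes use ciphertexts orders of magnitude larger, which is the
# source of the memory overheads discussed in the paper.
import random
from math import gcd

p, q = 293, 433          # toy primes (completely insecure)
n = p * q
n2 = n * n
g = n + 1                # standard simple choice of generator
lam = (p - 1) * (q - 1)  # phi(n); coprime to n for these primes
mu = pow(lam, -1, n)     # modular inverse of lam mod n (Python 3.8+)

def encrypt(m):
    """Encrypt integer m (0 <= m < n) with fresh randomness r."""
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    """Recover m via the standard L(x) = (x - 1) // n function."""
    L = (pow(c, lam, n2) - 1) // n
    return (L * mu) % n

# Homomorphic property: ciphertext product = plaintext sum.
c1, c2 = encrypt(17), encrypt(25)
assert decrypt((c1 * c2) % n2) == 17 + 25
```

Paillier only supports addition on ciphertexts; levelled schemes such as CKKS also support multiplication on encrypted vectors, which is what makes encrypted convolutions and matrix products (and hence DNN inference) possible, at the cost of the far larger ciphertexts and working sets that motivate the hybrid DRAM/PMem approach studied here.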
- Published
2021