The proliferation of machine learning services in the last few years has
raised data privacy concerns. Homomorphic encryption (HE) enables inference
on encrypted data, but it incurs 100x-10,000x memory and runtime overheads.
Secure deep neural network (DNN) inference using HE is currently limited by
computing and memory resources, with frameworks requiring hundreds of gigabytes
of DRAM to evaluate small models. To overcome these limitations, in this paper
we explore the feasibility of leveraging hybrid memory systems composed of
DRAM and persistent memory. In particular, we use the recently released
Intel Optane PMem technology and the Intel HE-Transformer nGraph to run large
neural networks such as MobileNetV2 (in its largest variant) and ResNet-50 for
the first time in the literature. We present an in-depth analysis of the
efficiency of the executions with different hardware and software
configurations. Our results show that DNN inference using HE exhibits
friendly access patterns for this memory configuration, yielding efficient

