Secure MLaaS with Temper: Trusted and Efficient Model Partitioning and Enclave Reuse
Machine Learning as a Service (MLaaS) is becoming a highly available and cost-efficient way to apply machine learning techniques in various domains. However, it carries data privacy risks, because user data must be uploaded to untrusted clouds. We propose Temper, a trusted and efficient MLaaS system based on secure hardware enclaves such as Intel SGX. Temper significantly improves performance without sacrificing data security guarantees or model inference accuracy. Its two key techniques, enclave reuse and model partitioning, reduce the enclave initialization and model loading costs and alleviate the secure paging overheads caused by the limited hardware-protected memory capacity in SGX. We also provide rigorous security guarantees for enclave sharing and batched processing by ensuring stateless, non-interfering, and data-oblivious processing and data transfers across model partitions. On average, Temper improves latency by 2.2$\times$ and throughput by 1.8$\times$ over the state-of-the-art designs, and stays within a 2.1$\times$ slowdown of untrusted native execution. Its distributed paradigm offers a more scalable path for future MLaaS with large models.
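To illustrate the idea behind model partitioning, the following is a minimal sketch, not Temper's actual partitioning algorithm. It assumes a model can be described as a sequence of per-layer memory footprints and greedily groups consecutive layers so that each partition fits within a protected-memory budget. The function name `partition_layers`, the layer sizes, and the budget value are all hypothetical and chosen only for illustration.

```python
from typing import List


def partition_layers(layer_sizes_mb: List[float], budget_mb: float) -> List[List[int]]:
    """Greedily group consecutive layers so each group fits in budget_mb.

    Returns a list of partitions, each a list of layer indices.
    This is an illustrative heuristic, not the method used by Temper.
    """
    partitions: List[List[int]] = []
    current: List[int] = []
    used = 0.0
    for idx, size in enumerate(layer_sizes_mb):
        if size > budget_mb:
            # A single layer larger than the budget cannot be placed as-is.
            raise ValueError(f"layer {idx} ({size} MB) exceeds the budget")
        if current and used + size > budget_mb:
            # Close the current partition and start a new one.
            partitions.append(current)
            current, used = [], 0.0
        current.append(idx)
        used += size
    if current:
        partitions.append(current)
    return partitions


if __name__ == "__main__":
    # Hypothetical per-layer footprints (MB) and a hypothetical budget.
    sizes = [30.0, 40.0, 25.0, 60.0, 10.0]
    print(partition_layers(sizes, budget_mb=80.0))
```

Under these assumptions, each resulting partition could be served by its own enclave without exceeding the budget, which is the property that motivates partitioning as a way to avoid secure paging. A real design would also need to account for activation sizes and the cost of transferring intermediate data between partitions.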