Abstract:

Abstract. Billions of dollars and countless GPU hours are currently spent on training Deep Neural Networks (DNNs) for a variety of tasks. Thus, it is essential to determine the difficulty of stealing such networks (by extracting all their parameters) when given access to their black-box implementations. Many versions of this problem have been studied over the last 30 years, including a deep result by Charles Fefferman from 1994 that the internal parameters of such networks are uniquely determined by their inputs and outputs, in the same way that the coefficients of multivariate polynomials are uniquely determined by their inputs and outputs via interpolation. However, his result does not determine the difficulty of this task, and the best currently known algorithm for extracting all the parameters of a ReLU-based deep neural network was presented at Crypto’20 by Carlini, Jagielski, and Mironov. It resembles a differential chosen plaintext attack on a cryptosystem, which has a secret key embedded in its black-box implementation and requires a polynomial number of queries but an exponential amount of time (as a function of the number of neurons).
In this talk, I will improve this attack by developing several new techniques that make it possible to extract with arbitrarily high precision all the real-valued parameters of a ReLU-based DNN using a polynomial number of queries AND a polynomial amount of time. I will demonstrate its practical efficiency by applying it to a full-sized neural network for classifying the CIFAR10 dataset, which has 3072 inputs, 8 hidden layers with 256 neurons each, and about 1.2 million neuronal parameters. An attack following the approach by Carlini et al. requires an exhaustive search over 2^256 possibilities, whereas the new attack requires only 30 minutes on a 256-core computer.