New algorithm reveals how private privacy-preserving computations really are

The new algorithm is already being adopted in a dedicated open source library released by Google and in an open source library developed by FCAI researchers.


Researchers of the Finnish Center for Artificial Intelligence FCAI at the University of Helsinki and Aalto University have determined, with high precision, how much privacy data subjects retain when their data are used to train a privacy-preserving machine learning model such as a neural network.

The new algorithm is based on a privacy framework called differential privacy, which MIT Technology Review shortlisted in 2020 as a technology that will change the way we live.

“Differential privacy is used, among other things, to guarantee that AI systems developed by Google and Apple using sensitive user data cannot reveal that sensitive data, as well as to guarantee the privacy of data released by the 2020 US Census”, says Antti Honkela, associate professor at the University of Helsinki.

The new algorithm allows a more accurate estimation of the retained privacy in these and many other kinds of analyses.

New privacy accountant provides almost perfect accuracy

Differentially private algorithms randomly perturb computations to preserve privacy. FCAI researchers have succeeded, for the first time, in accurately quantifying the privacy protection that these perturbations provide, even for very complex algorithms.
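To illustrate what such a perturbation looks like, the following Python sketch releases a differentially private mean by adding Gaussian noise calibrated with the classical (epsilon, delta) formula. The data, the privacy parameters and the function name are illustrative assumptions for this example, not part of the libraries mentioned above.

```python
import numpy as np

def dp_mean(values, epsilon, delta, lower=0.0, upper=1.0, rng=None):
    """Differentially private mean via the classical Gaussian mechanism.

    Each value is clipped to [lower, upper], so changing one record shifts
    the mean by at most (upper - lower) / n; Gaussian noise calibrated to
    that sensitivity gives (epsilon, delta)-DP (classical bound, epsilon < 1).
    """
    rng = rng or np.random.default_rng()
    values = np.clip(np.asarray(values, dtype=float), lower, upper)
    n = len(values)
    sensitivity = (upper - lower) / n
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * sensitivity / epsilon
    return values.mean() + rng.normal(0.0, sigma)

# Example: release the mean of sensitive 0/1 survey answers.
answers = np.array([0, 1, 1, 0, 1, 1, 1, 0, 1, 0] * 100)
print(dp_mean(answers, epsilon=0.5, delta=1e-5))
```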

“The larger the perturbation, the stronger the privacy claims typically become, but estimating how private a particular algorithm really is can be difficult. Estimating the precise privacy loss is especially difficult for complex algorithms such as neural network training, and this requires using a so-called privacy accountant”, says Antti Koskela, Postdoctoral researcher at the University of Helsinki.

FCAI researchers have developed a new privacy accountant that provides almost perfect accuracy. The method comes with provable upper and lower bounds on the true privacy loss. 
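The accountant tracks the full privacy loss distribution of a mechanism and composes it over many rounds using the fast Fourier transform. The sketch below applies that general idea to a toy discrete mechanism, binary randomized response, to compute a tight delta(epsilon) after k compositions. It is a simplified illustration of the approach, not the released library, and it omits the discretization error analysis that yields the provable upper and lower bounds; the values of p, k and epsilon are illustrative.

```python
import numpy as np

# Toy FFT-based privacy loss accounting for the k-fold composition of binary
# randomized response (the true answer is reported with probability p).
p = 0.75                      # probability of answering truthfully
k = 20                        # number of times the mechanism is composed
c = np.log(p / (1 - p))       # single-run privacy loss is +c or -c

# Privacy loss distribution (PLD) of one run on an integer lattice
# (in units of c), stored in an array long enough to avoid wrap-around.
n = 2 * k + 1
pld = np.zeros(n)
pld[1] = p                    # loss +c with probability p
pld[n - 1] = 1 - p            # loss -c with probability 1 - p

# Composing k runs convolves the PLD with itself k times; in Fourier
# space this is simply raising the transform to the k-th power.
composed = np.maximum(np.real(np.fft.ifft(np.fft.fft(pld) ** k)), 0.0)

# Map array indices back to signed lattice positions and real loss values.
idx = np.arange(n)
losses = c * np.where(idx <= k, idx, idx - n)

def delta(eps):
    """Tight delta(eps) of the composed mechanism, read off from its PLD."""
    return float(np.sum(composed * np.maximum(0.0, 1.0 - np.exp(eps - losses))))

print(delta(2.0))             # tight delta for epsilon = 2.0 after k runs
```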

“The new algorithm makes it possible to prove stronger privacy bounds for the same computation. Conversely, one can reduce the magnitude of the random perturbations and obtain more accurate results under privacy guarantees equal to those available before”, says Koskela.
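For the basic Gaussian mechanism this effect can be seen directly: the classical calibration formula overestimates the noise needed, while the known exact (epsilon, delta) characterisation of the Gaussian mechanism allows a smaller noise scale for the same guarantee. The sketch below compares the two noise scales; the parameter values are illustrative and the code is a self-contained example, not part of the published accountant.

```python
import math

def gaussian_delta(eps, sigma, sensitivity=1.0):
    """Exact delta(eps) of the Gaussian mechanism (analytic characterisation)."""
    phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return phi(a - b) - math.exp(eps) * phi(-a - b)

def smallest_sigma(eps, delta, sensitivity=1.0):
    """Smallest noise scale meeting (eps, delta)-DP, found by bisection."""
    lo, hi = 1e-3, 1e3
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(eps, mid, sensitivity) <= delta:
            hi = mid          # noise is sufficient, try less
        else:
            lo = mid          # noise is insufficient, need more
    return hi

eps, delta = 0.5, 1e-5
classic = math.sqrt(2.0 * math.log(1.25 / delta)) / eps   # classical calibration
tight = smallest_sigma(eps, delta)                        # exact calibration
print(f"classical sigma: {classic:.2f}, tight sigma: {tight:.2f}")
```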

The results will enable, for example, training machine learning models in such a way that each individual's exposure to a privacy breach is known precisely. This is an important step towards making machine learning and AI more trustworthy.

The new algorithm provides essentially the best obtainable measure of the retained privacy under the differential privacy formalism. Recent research suggests this is an accurate measure of how much private information a very powerful adversary could obtain from the published results. More research will be needed to extend these possibly pessimistic worst-case estimates to different and more realistic scenarios. Moreover, with the new, more accurate privacy bounds, different privacy-preserving machine learning algorithms can be compared more reliably, because inaccuracies in the privacy bounds no longer distort the comparisons.

The work will be published at the International Conference on Artificial Intelligence and Statistics (AISTATS) in April 2021.

Article:
Antti Koskela, Joonas Jälkö, Lukas Prediger, Antti Honkela: Tight Differential Privacy for Discrete-Valued Mechanisms and for the Subsampled Gaussian Mechanism Using FFT. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021).

More information:
Antti Honkela antti.honkela@helsinki.fi
Tel. +358 50 311 2483
Twitter: @ahonkela

Antti Koskela antti.h.koskela@helsinki.fi