TY - JOUR
T1 - Skellam Mixture Mechanism
T2 - 48th International Conference on Very Large Data Bases, VLDB 2022
AU - Bao, Ergute
AU - Zhu, Yizheng
AU - Xiao, Xiaokui
AU - Yang, Yin
AU - Ooi, Beng Chin
AU - Tan, Benjamin Hong Meng
AU - Aung, Khin Mi Mi
N1 - Publisher Copyright:
© 2022, VLDB Endowment. All rights reserved.
PY - 2022/7
Y1 - 2022/7
AB - Deep neural networks have strong capabilities of memorizing the underlying training data, which can be a serious privacy concern. An effective solution to this problem is to train models with differential privacy (DP), which provides rigorous privacy guarantees by injecting random noise into the gradients. This paper focuses on the scenario where sensitive data are distributed among multiple participants, who jointly train a model through federated learning, using both secure multiparty computation (MPC) to ensure the confidentiality of each gradient update, and differential privacy to avoid data leakage in the resulting model. A major challenge in this setting is that common mechanisms for enforcing DP in deep learning, which inject real-valued noise, are fundamentally incompatible with MPC, which exchanges finite-field integers among the participants. Consequently, most existing DP mechanisms require rather high noise levels, leading to poor model utility. Motivated by this, we propose the Skellam mixture mechanism (SMM), a novel approach to enforcing DP on models built via federated learning. Compared to existing methods, SMM eliminates the assumption that the input gradients must be integer-valued, and thus reduces the amount of noise injected to preserve DP. The theoretical analysis of SMM is highly non-trivial, especially considering (i) the complicated math of DP deep learning in general and (ii) the fact that the mixture of two Skellam distributions is rather complex. Extensive experiments in various practical settings demonstrate that SMM consistently and significantly outperforms existing solutions in terms of the utility of the resulting model.
UR - http://www.scopus.com/inward/record.url?scp=85137984039&partnerID=8YFLogxK
U2 - 10.14778/3551793.3551798
DO - 10.14778/3551793.3551798
M3 - Article
AN - SCOPUS:85137984039
SN - 2150-8097
VL - 15
SP - 2348
EP - 2360
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 11
Y2 - 5 September 2022 through 9 September 2022
ER -
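
The abstract contrasts real-valued DP noise with the finite-field integer arithmetic required by MPC-based secure aggregation. The sketch below is a minimal illustration of that general idea: adding symmetric Skellam noise (the difference of two Poisson draws) to a clipped, scaled gradient before reducing it into a finite field. It is not the paper's SMM algorithm; the function names and parameters (clip_norm, gamma, mu, field) are illustrative assumptions. In particular, SMM's contribution is to drop the requirement that inputs be integer-valued, whereas this sketch rounds to integers for simplicity.

    import numpy as np

    def skellam_noise(mu, shape, rng):
        # Symmetric Skellam(mu, mu) noise: difference of two i.i.d. Poisson(mu) draws.
        return rng.poisson(mu, shape) - rng.poisson(mu, shape)

    def privatize_gradient(grad, clip_norm=1.0, gamma=1024, mu=4096.0,
                           field=2**32, rng=None):
        # Illustrative sketch only (parameters are assumed, not from the paper).
        rng = rng or np.random.default_rng()
        # Clip the per-example gradient to bound its L2 sensitivity.
        norm = np.linalg.norm(grad)
        grad = grad * min(1.0, clip_norm / max(norm, 1e-12))
        # Scale and round to integers so the value fits finite-field arithmetic;
        # SMM itself avoids forcing the input to be integer-valued.
        scaled = np.round(grad * gamma).astype(np.int64)
        # Add integer-valued Skellam noise, then reduce modulo the field size,
        # as a secure-aggregation participant would before sharing its update.
        noisy = scaled + skellam_noise(mu, scaled.shape, rng)
        return np.mod(noisy, field)

    # Example usage with a toy gradient vector:
    # update = privatize_gradient(np.array([0.3, -0.7, 0.1]))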