Distributed Cardinality Estimation of Set Operations with Differential Privacy

Rade Stanojevic, Mohamed Nabeel, Ting Yu

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

20 Citations (Scopus)

Abstract

In this paper we study the problem of estimating the cardinality of pairwise set operations (union and intersection) over sets possessed by different data owners, while preserving differential privacy. In our problem setting, a data owner could only communicate with an untrusted server, and thus have to perturb its set data for privacy protection before sharing them with the server. This problem setting is relevant to diverse applications in practice, including sensor-based traffic monitoring, cross-domain data integration, and combining findings from multiple surveys. To tackle this problem, we first adopt existing randomized response technique to perturb the bit vector (to achieve differential privacy) and develop tools which the server can use to derive the cardinality of set operations from the randomized bit vectors. However, the variance of the union/intersection estimator grows linearly with the universe (bit-vector) size which is impractical for large universes. To keep the variance low we in addition propose to resort to Bloom filters instead of high-dimensional bit vectors to share set data with the server. The key insight is that in spite of inevitable collisions in BF by keeping its size small we can bound the variance of the union/intersection cardinality estimators. Finally, we show that investing a small part of the privacy budget into reporting (obfuscated) set cardinality can further reduce the estimator errors for up to 20%. Our empirical analysis reveals the impact of various parameters including privacy budget and Bloom filter size on the overall accuracy of the approach and demonstrates the utility of the proposed solution.

Original languageEnglish
Title of host publicationProceedings - 2017 IEEE Symposium on Privacy-Aware Computing, PAC 2017
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages37-48
Number of pages12
ISBN (Electronic)9781538610275
DOIs
Publication statusPublished - 4 Dec 2017
Event1st IEEE Symposium on Privacy-Aware Computing, PAC 2017 - Washington, United States
Duration: 1 Aug 20173 Aug 2017

Publication series

NameProceedings - 2017 IEEE Symposium on Privacy-Aware Computing, PAC 2017
Volume2017-January

Conference

Conference1st IEEE Symposium on Privacy-Aware Computing, PAC 2017
Country/TerritoryUnited States
CityWashington
Period1/08/173/08/17

Fingerprint

Dive into the research topics of 'Distributed Cardinality Estimation of Set Operations with Differential Privacy'. Together they form a unique fingerprint.

Cite this