TY - JOUR
T1 - Distribution-based microdata anonymization
AU - Koudas, Nick
AU - Yu, Ting
AU - Srivastava, Divesh
AU - Zhang, Qing
PY - 2009
Y1 - 2009
N2 - Before sharing to support ad hoc aggregate analyses, microdata often need to be anonymized to protect the privacy of individuals. A variety of privacy models have been proposed for microdata anonymization. Many of these models (e.g., t -closeness) essentially require that, after anonymization, groups of sensitive attribute values follow specified distributions. To support such models, in this paper we study the problem of transforming a group of sensitive attribute values to follow a certain target distribution with minimal data distortion. Specifically, we develop and evaluate a novel methodology that combines the use of sensitive attribute permutation and generalization with the addition of fake sensitive attribute values to achieve this transformation. We identify metrics related to accuracy of aggregate query answers over the transformed data, and develop efficient anonymization algorithms to optimize these accuracy metrics. Using a variety of data sets, we experimentally demonstrate the effectiveness of our techniques.
AB - Before sharing to support ad hoc aggregate analyses, microdata often need to be anonymized to protect the privacy of individuals. A variety of privacy models have been proposed for microdata anonymization. Many of these models (e.g., t -closeness) essentially require that, after anonymization, groups of sensitive attribute values follow specified distributions. To support such models, in this paper we study the problem of transforming a group of sensitive attribute values to follow a certain target distribution with minimal data distortion. Specifically, we develop and evaluate a novel methodology that combines the use of sensitive attribute permutation and generalization with the addition of fake sensitive attribute values to achieve this transformation. We identify metrics related to accuracy of aggregate query answers over the transformed data, and develop efficient anonymization algorithms to optimize these accuracy metrics. Using a variety of data sets, we experimentally demonstrate the effectiveness of our techniques.
UR - http://www.scopus.com/inward/record.url?scp=78649333205&partnerID=8YFLogxK
U2 - 10.14778/1687627.1687735
DO - 10.14778/1687627.1687735
M3 - Article
AN - SCOPUS:78649333205
SN - 2150-8097
VL - 2
SP - 958
EP - 969
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 1
ER -