TY - GEN
T1 - Privacy-preserving mining of sequential association rules from provenance workflows
AU - Maruseac, Mihai
AU - Ghinita, Gabriel
PY - 2016/3/9
Y1 - 2016/3/9
N2 - Provenance workflows capture movement and transformation of data in complex environments, such as document management in large organizations, content generation and sharing in social media, scientific computations, etc. Sharing and processing of provenance workflows brings numerous benefits, e.g., improving productivity in an organization, understanding social media interaction patterns, etc. However, directly sharing provenance may also disclose sensitive information such as confidential business practices, or private details about participants in a social network. We propose an algorithm that privately extracts sequential association rules from provenance workflow datasets. Finding such rules has numerous practical applications, such as capacity planning or identifying hot-spots in provenance graphs. Our approach provides good accuracy and strong privacy by leveraging the exponential mechanism of differential privacy. We propose a heuristic that identifies promising candidate rules and makes judicious use of the privacy budget. Experimental results show that our approach is fast and accurate, and clearly outperforms the state of the art. We also identify influential factors in improving accuracy, which helps in choosing promising directions for future improvement.
AB - Provenance workflows capture movement and transformation of data in complex environments, such as document management in large organizations, content generation and sharing in social media, scientific computations, etc. Sharing and processing of provenance workflows brings numerous benefits, e.g., improving productivity in an organization, understanding social media interaction patterns, etc. However, directly sharing provenance may also disclose sensitive information such as confidential business practices, or private details about participants in a social network. We propose an algorithm that privately extracts sequential association rules from provenance workflow datasets. Finding such rules has numerous practical applications, such as capacity planning or identifying hot-spots in provenance graphs. Our approach provides good accuracy and strong privacy by leveraging the exponential mechanism of differential privacy. We propose a heuristic that identifies promising candidate rules and makes judicious use of the privacy budget. Experimental results show that our approach is fast and accurate, and clearly outperforms the state of the art. We also identify influential factors in improving accuracy, which helps in choosing promising directions for future improvement.
UR - http://www.scopus.com/inward/record.url?scp=84964816350&partnerID=8YFLogxK
U2 - 10.1145/2857705.2857743
DO - 10.1145/2857705.2857743
M3 - Conference contribution
AN - SCOPUS:84964816350
T3 - CODASPY 2016 - Proceedings of the 6th ACM Conference on Data and Application Security and Privacy
SP - 127
EP - 129
BT - CODASPY 2016 - Proceedings of the 6th ACM Conference on Data and Application Security and Privacy
PB - Association for Computing Machinery, Inc
T2 - 6th ACM Conference on Data and Application Security and Privacy, CODASPY 2016
Y2 - 9 March 2016 through 11 March 2016
ER -