Probabilistic dimension reduction method for privacy preserving data clustering

Date
2019-01-01
Authors
Jalla, Hanumantha Rao
Girija, P. N.
Abstract
Business organizations increasingly rely on data mining to stay competitive, but publishing original customer data for that purpose can violate the privacy of individual customers. This paper proposes a distance-preserving perturbation method for Privacy Preserving Data Mining (PPDM) based on t-Stochastic Neighbor Embedding (t-SNE). t-SNE is primarily a dimension reduction technique: it maps a high-dimensional data set to a chosen lower dimension while aiming to maintain the distances between records in the low-dimensional data. We choose the K-means algorithm as the knowledge-based technique because it operates on the distances between data records. Setting the perplexity parameter in the proposed method makes it difficult for unauthorized persons to convert the low-dimensional data back to the original data. The proposed method is evaluated using the Variation of Information (VI) between the clusters obtained from the original data and those obtained from the modified data. The method is applied to various data sets, and the original and modified data clusters are compared through VI.
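The evaluation measure named in the abstract can be illustrated with a short sketch. The abstract does not give the exact formulation the authors use, so the code below assumes the standard definition VI(A, B) = H(A) + H(B) - 2 I(A; B) with natural logarithms and no normalization. The function name `variation_of_information` and the label lists are hypothetical and serve only as an illustration, not as the authors' implementation.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Variation of Information between two clusterings of the same records.

    Assumption: standard definition with natural logarithms, i.e.
    VI = -sum_ij p_ij * (log(p_ij / p_i) + log(p_ij / q_j)).
    A value of 0 means the two clusterings are identical up to relabeling.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same records")
    n = len(labels_a)
    if n == 0:
        raise ValueError("at least one record is required")

    count_a = Counter(labels_a)              # cluster sizes in clustering A
    count_b = Counter(labels_b)              # cluster sizes in clustering B
    joint = Counter(zip(labels_a, labels_b))  # sizes of cluster overlaps

    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        # p_ab / p_a simplifies to n_ab / count_a[a]; likewise for b.
        vi -= p_ab * (log(n_ab / count_a[a]) + log(n_ab / count_b[b]))
    return vi


# Hypothetical cluster labels for the original and the perturbed data.
original_clusters = [0, 0, 1, 1, 2, 2]
modified_clusters = [1, 1, 0, 0, 2, 2]  # same partition, different label names
print(variation_of_information(original_clusters, modified_clusters))
```

In the setting described by the abstract, the first list would hold K-means labels computed on the original data and the second would hold K-means labels computed on the t-SNE output. A VI close to zero would indicate that the perturbation left the cluster structure largely intact.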
Keywords
K-means, Perturbation, Privacy, t-SNE
Citation
Advances in Intelligent Systems and Computing. v.813