Back to Search Start Over

Differential Privacy High-Dimensional Data Publishing Based on Feature Selection and Clustering.

Authors :
Chu, Zhiguang
He, Jingsha
Zhang, Xiaolei
Zhang, Xing
Zhu, Nafei
Source :
Electronics (2079-9292); May2023, Vol. 12 Issue 9, p1959, 16p
Publication Year :
2023

Abstract

As a social information product, the privacy and usability of high-dimensional data are the core issues in the field of privacy protection. Feature selection is a commonly used dimensionality reduction processing technique for high-dimensional data. Some feature selection methods only process some of the features selected by the algorithm and do not take into account the information associated with the selected features, resulting in the usability of the final experimental results not being high. This paper proposes a hybrid method based on feature selection and a cluster analysis to solve the data utility and privacy problems of high-dimensional data in the actual publishing process. The proposed method is divided into three stages: (1) screening features; (2) analyzing the clustering of features; and (3) adaptive noise. This paper uses the Wisconsin Breast Cancer Diagnostic (WDBC) database from UCI's Machine Learning Library. Using classification accuracy to evaluate the performance of the proposed method, the experiments show that the original data are processed by the algorithm in this paper while protecting the sensitive data information while retaining the contribution of the data to the diagnostic results. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
20799292
Volume :
12
Issue :
9
Database :
Complementary Index
Journal :
Electronics (2079-9292)
Publication Type :
Academic Journal
Accession number :
163684162
Full Text :
https://doi.org/10.3390/electronics12091959