PPMGS: An efficient and effective solution for distributed privacy-preserving semi-supervised learning.
- Source :
- Information Sciences. Sep2024, Vol. 678, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2024
Abstract
- Recently, distributed semi-supervised learning has attracted increasing research attention due to its tremendous practical value. A promising distributed semi-supervised learning method should not only achieve desirable classification performance but also protect data privacy in distributed scenarios. Existing approaches typically capture the similarities between data instances with privacy-preserving computations. This paradigm introduces extra computation and heuristic changes to the algorithm, resulting in sub-optimal solutions that are time-consuming. In current distributed semi-supervised learning, instance similarities are widely used to capture the underlying manifold or guide label propagation. This paper emphasizes that instance similarities are not necessary because the structure of data connections can be estimated using coarser-grained information. We propose a Privacy-preserving Mixture-distribution based Graph Smoothing (PPMGS) model for distributed privacy-preserving semi-supervised learning. Our motivation is to construct a graph based on a Gaussian mixture distribution instead of individual data instances, which better captures the underlying data distribution and improves model efficiency. PPMGS includes a privacy-preserving expectation-maximization (EM) phase to estimate the Gaussian mixture distribution depicting the input data and a mixture-distribution-based graph smoothing algorithm to learn a distribution-based classifier by fitting a few labeled samples. Experimental results show that the proposed PPMGS achieves 5%-10% higher accuracy and macro-F1 than state-of-the-art privacy-preserving semi-supervised learning methods. In terms of efficiency, it reduces time cost by 97% and communication cost by 96% on the most complex dataset. The numerical results demonstrate that our proposal outperforms state-of-the-art baselines in both efficiency and effectiveness. [ABSTRACT FROM AUTHOR]
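- Illustrative sketch (not part of the record, and not the authors' PPMGS protocol): the abstract's idea of estimating a Gaussian mixture from coarser-grained information can be pictured with a generic distributed EM loop in which each party shares only per-component sums, never individual instances. Everything below is an assumption made for illustration: the function names, the restriction to one-dimensional data, and above all the plain summation of statistics, which merely stands in for whatever privacy-preserving aggregation the paper actually uses. The graph smoothing phase is not shown.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def local_statistics(data, weights, means, variances):
    """E-step on one party's data; returns only per-component sums."""
    k = len(weights)
    s0 = [0.0] * k  # sum of responsibilities
    s1 = [0.0] * k  # sum of responsibility * x
    s2 = [0.0] * k  # sum of responsibility * x^2
    for x in data:
        dens = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(dens)
        # Guard against numerical underflow: fall back to uniform responsibilities.
        resp = [d / total for d in dens] if total > 0.0 else [1.0 / k] * k
        for j in range(k):
            s0[j] += resp[j]
            s1[j] += resp[j] * x
            s2[j] += resp[j] * x * x
    return s0, s1, s2


def distributed_em(parties, means, variances, weights, iterations=50, min_var=1e-6):
    """Hypothetical distributed EM for a 1-D Gaussian mixture.

    ASSUMPTION: the plain sums below stand in for a secure aggregation step;
    this sketch offers no privacy guarantee by itself.
    """
    k = len(weights)
    for _ in range(iterations):
        stats = [local_statistics(p, weights, means, variances) for p in parties]
        s0 = [sum(s[0][j] for s in stats) for j in range(k)]
        s1 = [sum(s[1][j] for s in stats) for j in range(k)]
        s2 = [sum(s[2][j] for s in stats) for j in range(k)]
        n = sum(s0)
        # M-step from aggregated statistics; keep old values for empty components.
        new_means = [s1[j] / s0[j] if s0[j] > 1e-12 else means[j] for j in range(k)]
        new_vars = [
            max(s2[j] / s0[j] - new_means[j] ** 2, min_var) if s0[j] > 1e-12 else variances[j]
            for j in range(k)
        ]
        weights = [s0[j] / n for j in range(k)]
        means, variances = new_means, new_vars
    return weights, means, variances


# Usage: two parties, each holding private samples from two clusters.
party_a = [-0.5, 0.0, 0.5, 9.5, 10.0]
party_b = [-0.2, 0.2, 10.5, 9.8, 10.2]
weights, means, variances = distributed_em(
    [party_a, party_b], means=[1.0, 8.0], variances=[4.0, 4.0], weights=[0.5, 0.5]
)
```

  In this sketch the per-round communication depends on the number of mixture components rather than on the number of instance pairs, which is one way to read the abstract's efficiency argument.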
Details
- Language :
- English
- ISSN :
- 00200255
- Volume :
- 678
- Database :
- Academic Search Index
- Journal :
- Information Sciences
- Publication Type :
- Periodical
- Accession number :
- 178148228
- Full Text :
- https://doi.org/10.1016/j.ins.2024.120934