Parallel deep forest algorithm based on Spark and NRSCA strategy.

Authors :
毛伊敏
刘绍芬
Source :
Application Research of Computers / Jisuanji Yingyong Yanjiu. Jan 2024, Vol. 41 Issue 1, p126-133. 8p.
Publication Year :
2024

Abstract

This paper proposes a parallel deep forest algorithm based on Spark and the NRSCA strategy (PDF-SNRSCA) to address several problems that parallel deep forest algorithms encounter in big data environments: excessive redundant and irrelevant features, low utilization of features at both ends, slow model convergence, and low parallel efficiency of cascade forests. First, the algorithm introduces a feature selection strategy based on neighborhood rough sets and the Fisher score (FSNRS), which measures the relevance and redundancy of features in order to reduce the number of redundant and irrelevant features. Second, it introduces a scanning strategy based on random selection and equidistant extraction (S-RSEE), which ensures that all features are used with the same probability and so resolves the low utilization of the two ends in multi-grained scanning. Finally, the algorithm uses the Spark framework to train the cascade forests in parallel, and it introduces a feature filtering mechanism based on an importance index (FFM-II) that balances the dimensions of the enhanced class vectors and the original class vectors, which speeds up model convergence. In addition, a task scheduling mechanism based on SCA (TSM-SCA) redistributes tasks to keep the cluster load balanced, which addresses the low parallel efficiency of cascade forests. Experiments show that PDF-SNRSCA effectively improves the classification performance of deep forests and greatly improves the efficiency of their parallel training. [ABSTRACT FROM AUTHOR]
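As a point of reference for the Fisher score mentioned in the abstract, the following is a minimal sketch of the standard per-feature Fisher score (between-class scatter divided by within-class scatter). It is not the paper's FSNRS strategy, whose details the abstract does not give. The function name `fisher_scores` and the toy data are assumptions made for illustration only.

```python
import numpy as np


def fisher_scores(X, y):
    """Standard Fisher score per feature (illustrative; not the paper's FSNRS).

    score_j = sum_k n_k * (mu_kj - mu_j)^2 / sum_k n_k * var_kj
    where k ranges over classes, n_k is the class size, mu_kj and var_kj are
    the class mean and (population) variance of feature j, and mu_j is the
    overall mean of feature j.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]          # samples belonging to this class
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)
    # Guard against division by zero for features that are constant per class.
    return between / np.maximum(within, 1e-12)


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X_toy = np.array([[0.0, 5.0], [1.0, 6.0], [10.0, 5.0], [11.0, 6.0]])
y_toy = np.array([0, 0, 1, 1])
scores = fisher_scores(X_toy, y_toy)
```

A higher score indicates a feature whose class means are far apart relative to its within-class spread, so in this toy data the first feature ranks above the second.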

Subjects

Subjects :
*ROUGH sets
*ALGORITHMS

Details

Language :
Chinese
ISSN :
1001-3695
Volume :
41
Issue :
1
Database :
Academic Search Index
Journal :
Application Research of Computers / Jisuanji Yingyong Yanjiu
Publication Type :
Academic Journal
Accession number :
175061727
Full Text :
https://doi.org/10.19734/j.issn.1001-3695.2023.05.0196