Block size, parallelism and predictive performance: finding the sweet spot in distributed learning.
- Source :
- International Journal of Parallel, Emergent & Distributed Systems. May 2024, Vol. 39 Issue 3, p379-398. 20p.
- Publication Year :
- 2024
Abstract
- As distributed and multi-organization Machine Learning emerges, new challenges must be solved, such as diverse and low-quality data or real-time delivery. In this paper, we use a distributed learning environment to analyze the relationship between block size, parallelism, and predictor quality. Specifically, the goal is to find the optimum block size and the best heuristic to create distributed Ensembles. We evaluated three different heuristics and five block sizes on four publicly available datasets. Results show that using fewer but better base models matches or outperforms a standard Random Forest, and that 32 MB is the best block size. [ABSTRACT FROM AUTHOR]
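The abstract's core idea — training base models on fixed-size data blocks, then keeping only the best of them in the ensemble — can be illustrated with a small sketch. This is a hypothetical illustration, not the authors' code: the block count, the "keep the top-k by validation score" heuristic, and the scikit-learn models are all assumptions standing in for the paper's block sizes and heuristics.

```python
# Hypothetical sketch (not the paper's implementation): train one base
# model per data "block", keep only the best-scoring base models, and
# compare the pruned ensemble against a standard Random Forest.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

n_blocks = 8  # stands in for the block-size choice (e.g. 32 MB blocks)
blocks = np.array_split(np.arange(len(X_tr)), n_blocks)

# One tree per block, scored on a held-out validation split
# (a simple "fewer but better base models" heuristic).
scored = []
for idx in blocks:
    tree = DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx])
    scored.append((tree.score(X_val, y_val), tree))

top_k = [m for _, m in sorted(scored, key=lambda t: -t[0])[:4]]

# Majority vote over the selected base models.
votes = np.stack([m.predict(X_te) for m in top_k])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
ensemble_acc = float((ensemble_pred == y_te).mean())

baseline_acc = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)
print(f"pruned block ensemble: {ensemble_acc:.3f}  random forest: {baseline_acc:.3f}")
```

In the paper's actual setting the blocks come from a distributed file system and the heuristics are evaluated across four datasets; the sketch only shows the selection-by-quality mechanism in miniature.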
- Subjects :
- *MACHINE learning
*RANDOM forest algorithms
*CLASSROOM environment
Details
- Language :
- English
- ISSN :
- 1744-5760
- Volume :
- 39
- Issue :
- 3
- Database :
- Academic Search Index
- Journal :
- International Journal of Parallel, Emergent & Distributed Systems
- Publication Type :
- Academic Journal
- Accession number :
- 176614513
- Full Text :
- https://doi.org/10.1080/17445760.2023.2225854