1. Load-Balancing Scheduling Strategy for Spark Heterogeneous Clusters [Spark 异构集群负载均衡调度策略].
- Author
- 陶宇炜 and 谢爱娟
- Subjects
- *DYNAMIC loads, *DEAD loads (Mechanics), *HETEROGENEOUS computing, *DATA distribution, *SCHEDULING
- Abstract
- To address the problem that the Spark scalable distributed platform does not consider the computing capabilities of heterogeneous cluster nodes or load balance during job task scheduling, which degrades system performance, this paper constructs a load-balancing scheduling policy for heterogeneous cluster nodes in the Spark environment. Each heterogeneous cluster node predicts the data distribution characteristics using a sampling algorithm and divides the data into balanced partitions. Based on the weight distribution of static load and dynamic load, each node obtains its real-time load, and job tasks are scheduled dynamically. Finally, three benchmarks (WordCount, TeraSort, and K-means) were used for comparison and analysis on a running heterogeneous cluster. Experimental results show that the algorithm significantly reduces execution time and improves the performance of the heterogeneous cluster. [ABSTRACT FROM AUTHOR]
- Published
- 2024
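The two ideas summarized in the abstract (sample-based balanced partitioning, and scheduling by a weighted static/dynamic load) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the node fields `capacity` and `utilization`, the weights `w_static` and `w_dynamic`, and the scoring formula are all hypothetical, chosen only to make the idea concrete.

```python
import random
from bisect import bisect_right


def sample_boundaries(keys, num_partitions, sample_size=1000, seed=0):
    """Estimate split points from a random sample of the keys.

    Assumption: evenly spaced quantiles of the sample approximate
    boundaries that give each partition a similar number of records.
    """
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    return [sample[(i * len(sample)) // num_partitions]
            for i in range(1, num_partitions)]


def partition_of(key, boundaries):
    """Return the index of the partition whose key range contains `key`."""
    return bisect_right(boundaries, key)


def node_load(capacity, utilization, w_static=0.4, w_dynamic=0.6):
    """Hypothetical combined load score; lower means a better target.

    capacity:    relative static capability of the node, in (0, 1].
    utilization: current dynamic utilization of the node, in [0, 1].
    """
    return w_static * (1.0 - capacity) + w_dynamic * utilization


def pick_node(nodes):
    """Choose the node with the lowest combined load score."""
    best = min(nodes, key=lambda n: node_load(n["capacity"], n["utilization"]))
    return best["name"]
```

For example, a powerful but busy node can lose to a weaker idle one once the dynamic term is weighted in, which is the kind of trade-off a static-only scheduler cannot express.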