
Throughput prediction based on ExtraTree for stream processing tasks

Authors :
Askar Hamdulla
Zheng Chu
Jiong Yu
Source :
Computer Science and Information Systems. 18:1-22
Publication Year :
2021
Publisher :
National Library of Serbia, 2021.

Abstract

In the era of big data, as the volume of streaming data continues to grow, stream processing tasks (SPTs) face serious challenges in real-time scenarios that demand low latency and high throughput. However, much of the current literature on SPT performance focuses on reactive approaches, which cannot reliably prevent system crashes caused by inherent performance volatility. This paper presents a novel throughput prediction method for SPTs based on ExtraTree to address these challenges. First, after studying the performance volatility of SPTs, a volatility detection algorithm is proposed to obtain reasonable metric values. Second, a regression function selection algorithm is proposed to output the performance values of SPTs in a relatively steady state. Third, an ExtraTree-based algorithm is proposed to predict the throughput of SPTs. Experimental results from two open-source benchmarks running on Apache Flink, a popular stream processing system (SPS), indicate that the proposed method achieves an average accuracy of 90.535% and an average efficiency of 0.835 s per 10,000 samples, demonstrating its effectiveness for predicting the throughput of SPTs.
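
The abstract names the technique but this record does not reproduce the paper's algorithms, so the following is only a minimal sketch of the general extremely randomized trees idea applied to throughput regression, not the authors' method. It assumes the standard formulation: each tree is grown on the full training sample, and each candidate split uses a cut point drawn uniformly at random within the feature's range. The feature names (parallelism, input rate), the toy saturation formula, and every function name are hypothetical and chosen for illustration.

```python
import random
from statistics import fmean


def _sse(values):
    """Sum of squared errors around the mean (split quality measure)."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, rng, min_samples=2):
    """Grow one extremely randomized regression tree.

    Returns a float for a leaf, or (feature, threshold, left, right).
    """
    # Stop when the node is too small or the target is already constant.
    if len(rows) < min_samples or max(targets) == min(targets):
        return fmean(targets)
    best = None  # (score, feature, threshold, left_idx, right_idx)
    for f in range(len(rows[0])):
        col = [r[f] for r in rows]
        lo, hi = min(col), max(col)
        if lo == hi:
            continue  # constant feature, cannot split on it
        thr = rng.uniform(lo, hi)  # random cut point, not an optimized one
        left = [i for i, v in enumerate(col) if v < thr]
        right = [i for i, v in enumerate(col) if v >= thr]
        if not left or not right:
            continue
        score = (_sse([targets[i] for i in left])
                 + _sse([targets[i] for i in right]))
        if best is None or score < best[0]:
            best = (score, f, thr, left, right)
    if best is None:
        return fmean(targets)
    _, f, thr, left, right = best
    return (
        f,
        thr,
        build_tree([rows[i] for i in left], [targets[i] for i in left],
                   rng, min_samples),
        build_tree([rows[i] for i in right], [targets[i] for i in right],
                   rng, min_samples),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if row[f] < thr else right
    return node


def build_forest(rows, targets, n_trees=25, seed=0):
    """Every tree sees the whole sample; only the cut points differ."""
    rng = random.Random(seed)
    return [build_tree(rows, targets, rng) for _ in range(n_trees)]


def predict_forest(forest, row):
    """Average the per-tree predictions."""
    return fmean(predict_tree(tree, row) for tree in forest)


# Hypothetical toy data: (parallelism, input rate) -> throughput,
# where throughput saturates at an assumed capacity of 1500 per slot.
X = [(p, r) for p in (1, 2, 4, 8) for r in (1000.0, 2000.0, 4000.0)]
y = [min(r, 1500.0 * p) for p, r in X]

forest = build_forest(X, y)
estimate = predict_forest(forest, (3, 3000.0))  # unseen configuration
```

Because the cut points are random rather than searched, training is cheap per tree, and averaging many trees offsets the variance of any single one. In practice a library implementation such as scikit-learn's `ExtraTreesRegressor` would be the usual choice over hand-written code like this.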

Details

ISSN :
2406-1018 and 1820-0214
Volume :
18
Database :
OpenAIRE
Journal :
Computer Science and Information Systems
Accession number :
edsair.doi...........19f3599f800c550b67add2e3889c34c8
Full Text :
https://doi.org/10.2298/csis200131031c