1. SPSC: Stream Processing Framework Atop Serverless Computing for Industrial Big Data
- Author
- Cai, Zinuo, Chen, Zebin, Chen, Xinglei, Ma, Ruhui, Guan, Haibing, and Buyya, Rajkumar
- Abstract
With advances in smart manufacturing and information technologies, the volume of data to process is increasing accordingly. Current solutions for big data processing resort to distributed stream processing systems, such as Apache Flink and Apache Spark. However, such frameworks face challenges of resource underutilization and high latency in big data application scenarios. In this article, we propose SPSC, a serverless-based stream computing framework in which events are discretized into the atomic stream and stateless Lambda functions serve as context-irrelevant operators, achieving task parallelism and inherent data parallelism in processing. We also implement a prototype of the framework on Amazon Web Services (AWS) using AWS Lambda, Amazon Simple Queue Service (SQS), and Amazon DynamoDB. The evaluation shows that, compared with Alibaba’s Realtime Compute for Apache Flink, SPSC outperforms it by 10.12% at comparable overhead.
- Published
- 2024