Stateful Entities: Object-oriented Cloud Applications as Distributed Dataflows

Authors :
Zorgdrager, Wouter
Psarakis, Kyriakos
Fragkoulis, Marios
Visser, Eelco
Katsifodimos, Asterios
Publication Year :
2021
Publisher :
arXiv, 2021.

Abstract

Programming stateful cloud applications remains a painful experience. Instead of focusing on business logic, programmers spend most of their time on distributed systems concerns, the most important being consistency, load balancing, failure management, recovery, and scalability. At the same time, we witness an unprecedented adoption of modern dataflow systems such as Apache Flink, Google Dataflow, and Timely Dataflow. These systems are now performant and fault-tolerant, and they offer excellent state management primitives. In this line of work, we investigate the opportunities and limits of compiling general-purpose programs into stateful dataflows. Given a set of easy-to-follow code conventions, programmers can author stateful entities, a programming abstraction embedded in Python. We present a compiler pipeline named StateFlow that analyzes the abstract syntax tree of a Python application and rewrites it into an intermediate representation based on stateful dataflow graphs. StateFlow compiles that intermediate representation to one of several target execution systems: Apache Flink, Apache Beam, AWS Lambda, Flink's Statefun, and Cloudburst. Through an experimental evaluation, we demonstrate that the code generated by StateFlow incurs minimal overhead. While developing and deploying our prototype, we observed important limitations of current dataflow systems in executing cloud applications at scale.
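
The AST analysis step mentioned in the abstract can be illustrated with a minimal sketch using only Python's standard `ast` module. This is not StateFlow's actual API or pipeline: the `Account` class, the `analyze_entity` function, and the shape of the returned summary are all hypothetical, chosen only to show how a class written as ordinary Python could be inspected to find which methods read or write which pieces of entity state, the kind of information a dataflow-oriented intermediate representation would plausibly need.

```python
import ast
import textwrap

# Hypothetical entity written as plain Python (an assumption for illustration,
# not StateFlow's real conventions or decorators).
SOURCE = textwrap.dedent("""
    class Account:
        def __init__(self, owner: str):
            self.owner = owner
            self.balance = 0

        def deposit(self, amount: int) -> int:
            self.balance += amount
            return self.balance

        def get_owner(self) -> str:
            return self.owner
""")


def _self_attr(node):
    """Return the attribute name if `node` is `self.<name>`, else None."""
    if (isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "self"):
        return node.attr
    return None


def analyze_entity(source: str) -> dict:
    """Summarize which `self` attributes each method of the first class reads and writes."""
    tree = ast.parse(source)
    cls = next(n for n in ast.walk(tree) if isinstance(n, ast.ClassDef))
    methods = {}
    for fn in cls.body:
        if not isinstance(fn, ast.FunctionDef):
            continue
        reads, writes = set(), set()
        for node in ast.walk(fn):
            # `self.x += v` both reads and writes x; the target itself only
            # carries a Store context, so record the read explicitly.
            if isinstance(node, ast.AugAssign):
                name = _self_attr(node.target)
                if name is not None:
                    reads.add(name)
            name = _self_attr(node)
            if name is None:
                continue
            if isinstance(node.ctx, ast.Store):
                writes.add(name)
            elif isinstance(node.ctx, ast.Load):
                reads.add(name)
        methods[fn.name] = {"reads": sorted(reads), "writes": sorted(writes)}
    return {"entity": cls.name, "methods": methods}


if __name__ == "__main__":
    for method, access in analyze_entity(SOURCE)["methods"].items():
        print(method, access)
```

In this sketch each method is treated as a candidate operation on the entity's state; a real compiler would additionally need to handle calls between entities, control flow, and type information, none of which is attempted here.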

Details

Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....92ba808ffed7187122f260a336dbb0f9
Full Text :
https://doi.org/10.48550/arxiv.2112.00710