1. Lock-free Data Structures for Data Stream Processing
- Author
-
Constantin Pohl and Alexander Baumstark
- Subjects
020203 distributed computing ,Computer science ,Distributed computing ,Design elements and principles ,020207 software engineering ,02 engineering and technology ,Data structure ,Stream processing ,Shared memory ,Multithreading ,0202 electrical engineering, electronic engineering, information engineering ,Non-blocking algorithm ,Tuple ,Data stream processing - Abstract
Processing data in real-time instead of storing and reading from tables has led to a specialization of DBMS into the so-called data stream processing paradigm. While high throughput and low latency are key requirements to keep up with varying stream behavior and to allow fast reaction to incoming events, there are many possibilities how to achieve them. In combination with modern hardware, like server CPUs with tens of cores, the parallelization of stream queries for multithreading and vectorization is a common schema. High degrees of parallelism, however, need efficient synchronization mechanisms to allow good scaling with threads for shared memory access.In this work, we identify the most time-consuming operations for stream processing exemplarily for our own stream processing engine PipeFabric. In addition, we present different design principles of lock-free data structures which are suited to overcome those bottlenecks. We will finally demonstrate how lock-freedom greatly improves performance for join processing and tuple exchange between operators under different workloads. Nevertheless, the efficient usage of lock-free data structures comes with additional efforts and pitfalls, which we also discuss in this paper.
- Published
- 2019