Li, Zhongmiao, UCL - SST/ICTM/INGI - Pôle en ingénierie informatique, UCL - Ecole Polytechnique de Louvain, Van Roy, Peter, Romano, Paolo, Pecheur, Charles, Riviere, Etienne, Martins, Bruno, Pedone, Fernando, and Hughes, Danny
The last few decades have witnessed unprecedented growth in large-scale online services. Distributed data storage systems, which are the fundamental building blocks of these services, face a number of challenging, and often antagonistic, requirements. On the one hand, many distributed data storage systems have shifted away from weak consistency and embraced strong, transactional semantics in order to tame the ever-growing complexity of modern applications. On the other hand, the need to store very large amounts of data and to serve geo-dispersed clients with low latency has driven modern data storage systems to adopt partial replication techniques, often applied to geo-distributed infrastructures. Unfortunately, when employed in geo-distributed and/or partially replicated settings, state-of-the-art approaches to enforcing transactional consistency suffer from severe bottlenecks that strongly hinder their efficiency. This dissertation investigates the use of speculative techniques to enhance the performance of partially replicated transactional data stores, with a focus on geo-distributed platforms. In this dissertation, the term speculation refers to the possibility of exposing the updates produced by uncommitted transactions to other transactions and/or to external clients in order to enhance performance. We apply speculation techniques to two fundamental approaches to developing replicated transactional data stores, namely Deferred Update Replication (DUR) and State Machine Replication (SMR). In DUR-based systems, transactions are first executed on one node and then propagated to other nodes for a global verification phase, during which pre-commit locks have to be held on the data items updated by the transactions. The global verification phase can throttle system throughput, especially when conflicts are frequent.
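The DUR flow described above (local execution, then a verification phase that holds pre-commit locks on updated items) can be illustrated with a minimal single-replica sketch. This is not the dissertation's protocol: the names `Replica`, `Txn`, `pre_commit` and `finalize` are hypothetical, versions are simple counters, and the network propagation to other nodes is omitted, so the window between `pre_commit` and `finalize` only stands in for the global verification phase.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical store: key -> (value, version), plus pre-commit locks."""
    data: dict = field(default_factory=dict)
    locks: dict = field(default_factory=dict)  # key -> id of the locking txn

    def read(self, key):
        return self.data.get(key, (None, 0))


@dataclass
class Txn:
    txn_id: str
    read_set: dict = field(default_factory=dict)   # key -> version observed
    write_set: dict = field(default_factory=dict)  # key -> buffered new value

    def read(self, replica, key):
        value, version = replica.read(key)
        self.read_set[key] = version
        return value

    def write(self, key, value):
        # Local execution only buffers updates; nothing is installed yet.
        self.write_set[key] = value


def pre_commit(replica, txn):
    """Start verification: check conflicts, then lock the updated keys."""
    touched = set(txn.read_set) | set(txn.write_set)
    # Conflict: another transaction holds a pre-commit lock on a touched key.
    if any(replica.locks.get(k) not in (None, txn.txn_id) for k in touched):
        return False
    # Conflict: a key read by this transaction has changed since it was read.
    if any(replica.read(k)[1] != v for k, v in txn.read_set.items()):
        return False
    for k in txn.write_set:
        replica.locks[k] = txn.txn_id
    return True


def finalize(replica, txn):
    """End verification: install the updates and release the locks."""
    for k, value in txn.write_set.items():
        _, version = replica.read(k)
        replica.data[k] = (value, version + 1)
        del replica.locks[k]


replica = Replica(data={"x": (10, 1)})
t1, t2 = Txn("t1"), Txn("t2")
t1.write("x", t1.read(replica, "x") + 1)
t2.write("x", t2.read(replica, "x") + 5)

first = pre_commit(replica, t1)    # t1 takes the pre-commit lock on x
blocked = pre_commit(replica, t2)  # t2 conflicts while the lock is held
finalize(replica, t1)
retry = pre_commit(replica, t2)    # t2 read a version that is now stale
```

In this sketch every conflicting transaction fails for as long as the lock on `x` is held, which is the window that grows when verification has to involve other nodes.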
We tackle this problem by introducing Speculative Transaction Replication (STR), a DUR protocol that exploits speculative reads to enhance the performance of geo-distributed, partially replicated transactional data stores. The use of speculative reads greatly reduces the ‘effective duration’ of pre-commit locks, thus removing one of the key bottlenecks of DUR-based protocols. However, the indiscriminate use of speculative reads can expose applications to concurrency anomalies that can compromise their correctness in subtle ways. We tackle this issue by introducing Speculative Snapshot Isolation (SPSI), an extension of Snapshot Isolation (SI), which specifies desirable atomicity and isolation guarantees that must hold when using speculative processing techniques. In a nutshell, SPSI guarantees that applications designed to operate using SI can safely execute atop STR, sheltering programmers from complex concurrency anomalies and source code modifications. Our experimental study shows that STR, thanks to the use of speculative reads, yields up to 11× throughput improvements over state-of-the-art approaches that do not adopt speculative techniques. In SMR-based systems, transactions first undergo an ordering phase; replicas then have to guarantee that the result of transaction execution is equivalent to a serial execution in the order produced by the ordering phase. To provide this guarantee, existing approaches use a single thread to execute or serialize transactions, which severely limits throughput, especially given the current architectural trend towards massively parallel multi-core processors. We tackle this limitation by introducing SPARKLE, an innovative deterministic concurrency control mechanism designed for Partially-Replicated State Machines (PRSMs).
SPARKLE taps the potential parallelism of modern multi-core systems through the use of speculative techniques and by avoiding inherently non-scalable designs that rely on a single thread for either executing or scheduling transactions. The key contribution of SPARKLE is a set of techniques that greatly reduce both the frequency of misspeculations and the cost of correcting them. Our evaluation shows that SPARKLE achieves up to one order of magnitude throughput gains compared to state-of-the-art systems. (FSA - Sciences de l'ingénieur) -- UCL, 2020