1. Just-in-time Data Distribution for Analytical Query Processing
- Author
- Ivanova, M.G. (Milena), Kersten, M.L. (Martin), and Groffen, F.E. (Fabian)
- Abstract
Distributed processing commonly requires data spread across machines using a priori static or hash-based data allocation. In this paper, we explore an alternative approach that starts from a master node in control of the complete database, and a variable number of worker nodes for delegated query processing. Data is shipped just-in-time to the worker nodes using a need-to-know policy, and is reused, where possible, in subsequent queries. A bidding mechanism among the workers yields a schedule with the most efficient reuse of previously shipped data, minimizing the data transfer costs. Just-in-time data shipment allows our system to benefit from locally available idle resources to boost overall performance. The system is maintenance-free and allocation is fully transparent to users. Our experiments show that the proposed adaptive distributed architecture is a viable and flexible alternative for small-scale MapReduce-type settings.
- Published
- 2012