1. Improved algorithms for intermediate dataset storage in a cloud-based dataflow.
- Author
- Cheng, Jie; Zhu, Daming; and Zhu, Binhai
- Subjects
- *BIG data, *ALGORITHMS, *COMPUTATIONAL complexity, *ELECTRONIC data processing, *MACHINE theory
- Abstract
- In order to run a dataflow with as low cost as possible, it is often faced with deciding which data-sets in a data-set sequence should be stored, with the rest regenerated. The Intermediate Data-set Storage problem arises from this situation. The current best algorithm for this problem takes $O(n^4)$ time. In this paper, we present two improved algorithms for this problem, the first of which can achieve a time complexity $O(n^2)$, the second of which $O(rn)$, where $n$ is the number of data-sets in a dataflow, $r$ is a numerical number which indicates how large it is for the maximum storage cost to be divided by the minimum computation cost in the dataflow. [ABSTRACT FROM AUTHOR]
- Published
- 2017
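To make the store-or-regenerate trade-off in the abstract concrete, here is a minimal sketch of a quadratic-time dynamic program. It is not the authors' algorithm, and the cost model is an assumption, since the abstract does not state one: the dataflow is taken to be a linear chain, data-set k has a hypothetical generation cost x[k], storage cost rate y[k], and usage rate v[k], and a deleted data-set is regenerated from its nearest stored predecessor (or from the always-available input). The function name `min_cost_storage` is also hypothetical.

```python
from typing import List, Tuple


def min_cost_storage(
    x: List[float], y: List[float], v: List[float]
) -> Tuple[float, List[int]]:
    """Return (minimum total cost rate, 1-based indices of stored data-sets).

    Assumed model (not taken from the paper): data-sets 1..n form a chain.
    A stored data-set k costs y[k-1]. A deleted data-set k costs
    v[k-1] * (sum of x over the deleted run from the nearest stored
    predecessor up to and including k).
    """
    n = len(x)
    inf = float("inf")
    # Anchor 0 is the dataflow input (always available, free).
    # Anchor n + 1 is a virtual end marker (free) that closes the last run.
    best = [inf] * (n + 2)
    prev = [-1] * (n + 2)
    best[0] = 0.0

    for i in range(n + 1):
        chain = 0.0  # generation cost accumulated since anchor i
        gap = 0.0    # cost of deleting every data-set strictly between i and j
        for j in range(i + 1, n + 2):
            store = y[j - 1] if j <= n else 0.0
            cand = best[i] + gap + store
            if cand < best[j]:
                best[j] = cand
                prev[j] = i
            if j <= n:
                # Extend the deleted run to include data-set j.
                chain += x[j - 1]
                gap += v[j - 1] * chain

    # Walk back through the chosen anchors to list the stored data-sets.
    stored = []
    j = prev[n + 1]
    while j > 0:
        stored.append(j)
        j = prev[j]
    stored.reverse()
    return best[n + 1], stored


if __name__ == "__main__":
    cost, stored = min_cost_storage([10, 1, 10], [1, 100, 1], [1, 1, 1])
    print(cost, stored)
```

Each pair of anchors (i, j) is visited once and the cost of the deleted run between them is updated incrementally, so the sketch runs in time quadratic in the number of data-sets under the assumed model.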