1. Scheduling Parallel-Task Jobs Subject to Packing and Placement Constraints.
- Author
- Shafiee, Mehrnoosh and Ghaderi, Javad
- Subjects
APPROXIMATION algorithms, PARALLEL programming, SCHEDULING, PARALLEL processing, PROBLEM solving
- Abstract
Jobs in modern parallel-computing frameworks, such as Hadoop and Spark, are subject to several constraints. In these frameworks, the data are typically distributed across a cluster of machines and are processed in multiple stages. Therefore, tasks that belong to the same stage (job) have a collective completion time that is determined by the slowest task in the collection. Furthermore, a task's processing time is machine dependent, and each machine is capable of processing multiple tasks at a time subject to its capacity. In "Scheduling Parallel-Task Jobs Subject to Packing and Placement Constraints," by Mehrnoosh Shafiee and Javad Ghaderi, multiple approximation algorithms with theoretical guarantees are provided to solve the problem under preemptive and nonpreemptive scenarios. The numerical results, using a real traffic trace, demonstrate that the algorithms yield significant gains over the prior approaches.
Motivated by modern parallel computing applications, we consider the problem of scheduling parallel-task jobs with heterogeneous resource requirements in a cluster of machines. Each job consists of a set of tasks that can be processed in parallel; however, the job is considered completed only when all its tasks finish their processing, which we refer to as the synchronization constraint. Furthermore, assignment of tasks to machines is subject to placement constraints, that is, each task can be processed only on a subset of machines, and processing times can also be machine dependent. Once a task is scheduled on a machine, it requires a certain amount of resource from that machine for the duration of its processing. A machine can process (pack) multiple tasks at the same time; however, the cumulative resource requirement of the tasks should not exceed the machine's capacity. Our objective is to minimize the weighted average of the jobs' completion times.
The problem, subject to synchronization, packing, and placement constraints, is NP-hard, and prior theoretical results only concern much simpler models. For the case that migration of tasks among the placement-feasible machines is allowed, we propose a preemptive algorithm with an approximation ratio of (6 + ϵ). In the special case that only one machine can process each task, we design an algorithm with an improved approximation ratio of four. Finally, in the case that migrations (and preemptions) are not allowed, we design an algorithm with an approximation ratio of 24. Our algorithms use a combination of linear program relaxation and greedy packing techniques. We present extensive simulation results, using a real traffic trace, that demonstrate that our algorithms yield significant gains over the prior approaches.
Funding: This work was supported by the National Science Foundation [Grants CNS-1652115 and CNS-1717867].
Supplemental Material: The online appendices are available at https://doi.org/10.1287/opre.2021.2198. [ABSTRACT FROM AUTHOR]
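As an illustration of the model described in the abstract, the sketch below evaluates a given nonpreemptive schedule: it derives each job's completion time from its slowest task (the synchronization constraint), checks the packing constraint on each machine, and computes the weighted average completion time. It is not the authors' algorithm. The names `Task`, `job_completion_times`, `respects_capacity`, and `weighted_average_completion` are hypothetical, a single resource type per machine is assumed, the average is assumed to be normalized by the total weight, and placement feasibility and machine-dependent processing times are taken as already reflected in the schedule data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    job: str         # job this task belongs to
    machine: str     # machine the task is placed on (assumed placement-feasible)
    start: float     # start time in the schedule
    duration: float  # processing time on that machine
    demand: float    # resource held for the whole duration


def job_completion_times(tasks):
    """Synchronization constraint: a job finishes when its last task finishes."""
    done = {}
    for t in tasks:
        done[t.job] = max(done.get(t.job, 0.0), t.start + t.duration)
    return done


def respects_capacity(tasks, capacity):
    """Packing constraint: load on a machine never exceeds its capacity.

    Load only rises when a task starts, so checking start instants suffices
    for a nonpreemptive schedule with half-open intervals [start, end).
    """
    for t in tasks:
        load = sum(
            u.demand
            for u in tasks
            if u.machine == t.machine and u.start <= t.start < u.start + u.duration
        )
        if load > capacity[t.machine] + 1e-9:
            return False
    return True


def weighted_average_completion(tasks, weights):
    """Objective value, assumed here to be sum(w_j * C_j) / sum(w_j)."""
    done = job_completion_times(tasks)
    total_weight = sum(weights[j] for j in done)
    return sum(weights[j] * done[j] for j in done) / total_weight


# Made-up example data: two jobs on two unit-capacity machines.
tasks = [
    Task("A", "m1", 0.0, 3.0, 0.5),
    Task("A", "m2", 0.0, 5.0, 0.4),
    Task("B", "m1", 0.0, 2.0, 0.5),
]
capacity = {"m1": 1.0, "m2": 1.0}
weights = {"A": 1.0, "B": 3.0}

print(job_completion_times(tasks))
print(respects_capacity(tasks, capacity))
print(weighted_average_completion(tasks, weights))
```

In this example, job A completes at time 5 because its task on `m2` is the slowest, even though its task on `m1` finishes at time 3.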
- Published
- 2022