TERMS: Task management policies to achieve high performance for mixed workloads using surplus resources.

Authors :
Yu, Jinyu
Tong, Wei
Lv, Pengze
Feng, Dan
Source :
Journal of Parallel & Distributed Computing. Dec2022, Vol. 170, p74-85. 12p.
Publication Year :
2022

Abstract

Resource contention and performance interference can degrade workload performance in clusters that host mixed workloads. Previous work guarantees the resource requirements of latency-sensitive tasks and reduces performance losses for batch jobs by reclaiming surplus resources from over-provisioned tasks. However, resource fragmentation leads to a mismatch between provisioned resources and task requirements, resulting in high operation overheads and a loss of task fairness. This paper proposes TERMS, a set of task management policies based on task relevance, resource distribution, and task fairness that aims to achieve efficient and low-cost task management. TERMS includes three main types of management policies. The task scheduling policy schedules new tasks according to task relevance. The task selection strategies select tasks for resource provisioning and task resumption based on resource requirements and task fairness. When straggler tasks must be eliminated by migration, the node selection strategy chooses suitable target nodes based on task relevance and node resource information. Evaluation results show that TERMS can further improve the performance of latency-sensitive services and batch jobs, reduce management overheads, and avoid operation failures.

• Place latency-sensitive new tasks in the queue according to task relevance.
• Select batch tasks to provision resources for latency-sensitive new tasks.
• Select batch tasks to provision resources for latency-sensitive straggler tasks.
• Ensure task fairness when tasks are preempted or resumed.
• Choose suitable target nodes for latency-sensitive straggler task migration.

[ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
07437315
Volume :
170
Database :
Academic Search Index
Journal :
Journal of Parallel & Distributed Computing
Publication Type :
Academic Journal
Accession number :
159292103
Full Text :
https://doi.org/10.1016/j.jpdc.2022.08.005