
Static Worksharing Strategies for Heterogeneous Computers with Unrecoverable Failures

Authors :
Frédéric Vivien
Yves Robert
Arnold L. Rosenberg
Anne Benoit
Laboratoire de l'Informatique du Parallélisme (LIP)
École normale supérieure - Lyon (ENS Lyon)-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-Université de Lyon-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)
Algorithms and Scheduling for Distributed Heterogeneous Platforms (GRAAL)
Inria Grenoble - Rhône-Alpes
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire de l'Informatique du Parallélisme (LIP)
Université de Lyon-Université de Lyon-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Lyon (ENS Lyon)-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-Université de Lyon-Centre National de la Recherche Scientifique (CNRS)
Hai-Xiang Lin
Michael Alexander
Martti Forsell
Andreas Knüpfer
Radu Prodan
Leonel Sousa
Achim Streit
École normale supérieure de Lyon (ENS de Lyon)-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-Université de Lyon-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure de Lyon (ENS de Lyon)-Université Claude Bernard Lyon 1 (UCBL)
Colorado State University [Fort Collins] (CSU)
Source :
HeteroPar 2009, Aug 2009, Delft, Netherlands. pp. 71-80, ⟨10.1007/978-3-642-14122-5_11⟩, Lecture Notes in Computer Science ISBN: 9783642141218, Euro-Par Workshops
Publication Year :
2009
Publisher :
HAL CCSD, 2009.

Abstract

One has a large workload that is "divisible" (its constituent work's granularity can be adjusted arbitrarily) and one has access to p remote computers that can assist in computing the workload. How can one best utilize the computers toward this end? Two features complicate this question. First, the remote computers may differ from one another in speed. Second, each remote computer is subject to interruptions of known likelihood that kill all work in progress on it. One wishes to orchestrate sharing the workload with the remote computers in a way that maximizes the expected amount of work completed, given the risk of interruptions. We consider three versions of the preceding problem. Two versions envision heterogeneous computing resources: the remote computers may differ from one another in speed; one version envisions homogeneous computing resources: the remote computers are identical. One of the heterogeneous versions ignores communication costs (i.e., assumes that they are negligible); the other two versions account explicitly for communication costs. We provide exact expressions for the optimal work expectation for all three versions of the problem. For the most general version (heterogeneous resources, with communication costs), we provide a recurrence for computing this expectation; for the other two versions, we provide closed-form expressions.
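
The quantity the abstract calls the expected amount of work completed can be illustrated with a small sketch. The abstract does not state the failure model, so everything below is an assumption made for illustration only: each computer receives a single chunk, communication costs are ignored, computer i needs rhos[i] time units per unit of work, and the probability that a computer has been interrupted by time t is min(1, kappa * t). The names expected_work, rhos, and kappa are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def expected_work(chunks: Sequence[float], rhos: Sequence[float], kappa: float) -> float:
    """Expected completed work when computer i gets one chunk of size chunks[i].

    Assumed model (not from the record): computer i needs rhos[i] time units
    per unit of work, and the probability that it has been interrupted by
    time t is min(1, kappa * t). A chunk counts only if it finishes intact.
    """
    if len(chunks) != len(rhos):
        raise ValueError("chunks and rhos must have the same length")
    total = 0.0
    for w, rho in zip(chunks, rhos):
        finish_time = rho * w
        # Probability that the computer is still alive when the chunk ends.
        survival = max(0.0, 1.0 - kappa * finish_time)
        total += w * survival
    return total


if __name__ == "__main__":
    rhos = [1.0, 2.0]  # hypothetical: the second computer is half as fast
    kappa = 0.01       # hypothetical interruption rate
    workload = 40.0

    # Split the workload evenly, ignoring the speed difference.
    even = [workload / 2, workload / 2]

    # Split the workload in proportion to each computer's speed 1/rho.
    speeds = [1.0 / r for r in rhos]
    by_speed = [workload * s / sum(speeds) for s in speeds]

    print("even split:     ", expected_work(even, rhos, kappa))
    print("speed-weighted: ", expected_work(by_speed, rhos, kappa))
```

Under these assumed parameters the speed-weighted split yields a higher expectation than the even split, which shows why heterogeneity in speed matters for the allocation. The sketch does not reproduce the paper's closed-form expressions or its recurrence for the case with communication costs.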

Details

Language :
English
ISBN :
978-3-642-14121-8
ISBNs :
9783642141218
Database :
OpenAIRE
Journal :
HeteroPar 2009, Aug 2009, Delft, Netherlands. pp. 71-80, ⟨10.1007/978-3-642-14122-5_11⟩, Lecture Notes in Computer Science ISBN: 9783642141218, Euro-Par Workshops
Accession number :
edsair.doi.dedup.....c48e9d71ec6eb6204e528fd5bb52980b