Author: "Valerie Hayot-Sasson" / Publisher: arxiv - Searchworks@Jio Institute Digital Library Search Results

1. Evaluation of pilot jobs for Apache Spark applications on HPC clusters

Author: Tristan Glatard and Valerie Hayot-Sasson
Subjects: Job scheduler; FOS: Computer and information sciences; Queueing theory; Computer Science - Performance; Computer science; Big data; Supercomputer; Scheduling (computing); Performance (cs.PF); Debugging; Computer Science - Distributed, Parallel, and Cluster Computing; Software deployment; Operating system; Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract: Big Data has become prominent throughout many scientific fields, and as a result, scientific communities have sought out Big Data frameworks to accelerate the processing of their increasingly data-intensive pipelines. However, while scientific communities typically rely on High-Performance Computing (HPC) clusters for the parallelization of their pipelines, many popular Big Data frameworks such as Hadoop and Apache Spark were primarily designed to be executed on dedicated commodity infrastructures. This paper evaluates the benefits of pilot jobs over traditional batch submission for Apache Spark on HPC clusters. Surprisingly, our results show that the speed-up provided by pilot jobs over batch scheduling is moderate to non-existent (0.98 on average) despite the presence of long queuing times. In addition, pilot jobs provide an extra layer of scheduling that complicates debugging and deployment. We conclude that traditional batch scheduling should remain the default strategy to deploy Apache Spark applications on HPC clusters.
Published: 2019
