Design of a Large-scale Task Dispatching & Processing System based on Hadoop

Authors :
Jik-Soo Kim
Soonwook Hwang
Nguyen Cao
Seoyoung Kim
Source :
Journal of KIISE. 43:613-620
Publication Year :
2016
Publisher :
Korean Institute of Information Scientists and Engineers, 2016.

Abstract

This paper presents MOHA (Many-Task Computing on Hadoop), a framework that aims to effectively apply Many-Task Computing (MTC) technologies, originally developed for high-performance processing of many tasks, to the existing Big Data processing platform Hadoop. We present the basic concepts, motivation, preliminary results of a proof of concept (PoC) based on a distributed message queue, and future research directions of MOHA. MTC applications may have relatively low I/O requirements per task. However, a very large number of tasks must be processed efficiently, with potentially heavy file-based inter-task communication. MTC applications can therefore exhibit a different pattern of data-intensive workload from existing Hadoop applications, which are typically based on relatively large data block sizes. Through an effective convergence of MTC and Big Data technologies, the MOHA framework can support large-scale scientific applications alongside the Hadoop ecosystem, which is evolving into a multi-application platform.

Details

ISSN :
2383-630X
Volume :
43
Database :
OpenAIRE
Journal :
Journal of KIISE
Accession number :
edsair.doi...........f645e4a9531de188195f5979cc393a03
Full Text :
https://doi.org/10.5626/jok.2016.43.6.613