Design of a Large-scale Task Dispatching & Processing System based on Hadoop
- Source :
- Journal of KIISE. 43:613-620
- Publication Year :
- 2016
- Publisher :
- Korean Institute of Information Scientists and Engineers, 2016.
Abstract
- This paper presents MOHA (Many-Task Computing on Hadoop), a framework that aims to apply Many-Task Computing (MTC) technologies, originally developed for high-performance processing of many tasks, to the existing Big Data processing platform Hadoop. We present the basic concepts and motivation of MOHA, preliminary results from a proof of concept (PoC) based on a distributed message queue, and future research directions. MTC applications may have relatively low I/O requirements per task, but they must efficiently process a very large number of tasks, potentially with heavy file-based communication between tasks. MTC applications can therefore exhibit a data-intensive workload pattern different from that of existing Hadoop applications, which are typically based on relatively large data block sizes. Through an effective convergence of MTC and Big Data technologies, the MOHA framework can support large-scale scientific applications alongside the Hadoop ecosystem, which is evolving into a multi-application platform.
Details
- ISSN :
- 2383-630X
- Volume :
- 43
- Database :
- OpenAIRE
- Journal :
- Journal of KIISE
- Accession number :
- edsair.doi...........f645e4a9531de188195f5979cc393a03
- Full Text :
- https://doi.org/10.5626/jok.2016.43.6.613