Nova

Authors :
Chao Tian
Mattias Larsson
Francis Liu
Christopher Olston
Topher ZiCornell
Andreas Neumann
Xiaodan Wang
Siddharth Seth
Vijayanand Sankarasubramanian
Vellanki B.N. Rao
Yiping Han
Laukik Chitnis
Greg I. Chiou
Source :
SIGMOD Conference
Publication Year :
2011
Publisher :
ACM, 2011.

Abstract

This paper describes a workflow manager developed and deployed at Yahoo called Nova, which pushes continually-arriving data through graphs of Pig programs executing on Hadoop clusters. (Pig is a structured dataflow language and runtime for the Hadoop map-reduce system.) Nova is like data stream managers in its support for stateful incremental processing, but unlike them in that it deals with data in large batches using disk-based processing. Batched incremental processing is a good fit for a large fraction of Yahoo's data processing use-cases, which deal with continually-arriving data and benefit from incremental algorithms, but do not require ultra-low-latency processing.

Details

Database :
OpenAIRE
Journal :
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Accession number :
edsair.doi...........ef52303e35d38b590027bc0f2406a98b
Full Text :
https://doi.org/10.1145/1989323.1989439