Nova
- Source :
- SIGMOD Conference
- Publication Year :
- 2011
- Publisher :
- ACM, 2011.
Abstract
- This paper describes a workflow manager developed and deployed at Yahoo called Nova, which pushes continually-arriving data through graphs of Pig programs executing on Hadoop clusters. (Pig is a structured dataflow language and runtime for the Hadoop map-reduce system.) Nova is like data stream managers in its support for stateful incremental processing, but unlike them in that it deals with data in large batches using disk-based processing. Batched incremental processing is a good fit for a large fraction of Yahoo's data processing use-cases, which deal with continually-arriving data and benefit from incremental algorithms, but do not require ultra-low-latency processing.
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
- Accession number :
- edsair.doi...........ef52303e35d38b590027bc0f2406a98b
- Full Text :
- https://doi.org/10.1145/1989323.1989439