Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters

Many important "big data" applications need to process data as it arrives, in real time. However, current programming models for distributed stream processing are relatively low-level, often leaving the user to manage both the consistency of state across the system and fault recovery. Furthermore, the models that do provide fault recovery do so expensively, requiring either hot replication or long recovery times. The authors propose a new programming model, Discretized Streams (D-Streams), that offers a high-level functional programming API, strong consistency, and efficient fault recovery.

Provided by: University of California
Topic: Data Centers
Date Added: May 2012
Format: PDF
