Download now Free registration required
The authors present SGuard, a new fault-tolerance technique for distributed Stream Processing Engines (SPEs) running in clusters of commodity servers. SGuard is less disruptive to normal stream processing and leaves more resources available for normal stream processing than previous proposals. Like several previous schemes, SGuard is based on rollback recovery : it checkpoints the state of stream processing nodes periodically and restarts failed nodes from their most recent checkpoints. In contrast to previous proposals, however, SGuard performs checkpoints asynchronously: i.e., operators continue processing streams during the checkpoint thus reducing the potential disruption due to the checkpointing activity.
- Format: PDF
- Size: 694.87 KB