Continuous Analytics Over Discontinuous Streams
Continuous analytics systems that enable query processing over streams of data have emerged as key solutions for dealing with massive data volumes and demands for low latency. These systems have been heavily influenced by an assumption that data streams can be viewed as sequences of data that arrived more or less in order. The reality, however, is that streams are not often so well behaved and disruptions of various sorts are endemic. The authors argue, therefore, that stream processing needs a fundamental rethink and advocate a unified approach toward continuous analytics over discontinuous streaming data. The approach is based on a simple insight - using techniques inspired by data parallel query processing, queries can be performed over independent sub-streams with arbitrary time ranges in parallel, generating partial results.