RWTH Aachen University
Distributed, highly parallel processing frameworks such as Hadoop are considered state-of-the-art for handling big data today. But they burden application developers with the task of manually implementing program logic using low-level batch-processing APIs. As a result, a trend can be observed toward high-level languages that allow data flows to be modeled declaratively and then automatically optimized and mapped to the batch-processing backends. However, most of these systems are based on programming models such as MapReduce, which provide elasticity and fault tolerance in a natural manner: since intermediate results are materialized, processes can simply be restarted, and they can be scaled by partitioning the input datasets.