Debellor: Open Source Modular Platform for Scalable Data Mining

Date Added: May 2009
Format: PDF

This paper introduces Debellor, an open source, extensible data mining platform with a stream-oriented architecture, in which all data transfers between elementary algorithms take the form of a stream of samples. Data streaming enables the implementation of scalable algorithms that can efficiently process volumes of data larger than available memory. This matters for data mining research and applications, since the most challenging data mining tasks involve voluminous data, either produced by a data source or generated at some intermediate stage of a complex data processing network. The advantages of data streaming are illustrated by experiments on clustering time series.
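
The stream-of-samples idea can be illustrated with a minimal sketch. It uses plain Python generators and is not Debellor's actual API; the names `source`, `scale`, and `running_mean` are hypothetical and chosen only for this illustration. Each stage pulls one sample at a time from the previous stage, so memory use stays constant however many samples flow through.

```python
from typing import Iterable, Iterator, List, Tuple

Sample = List[float]


def source(n: int) -> Iterator[Sample]:
    """Hypothetical data source: yields samples one by one, never as a full list."""
    for i in range(n):
        yield [float(i), float(i % 3)]


def scale(samples: Iterable[Sample], factor: float) -> Iterator[Sample]:
    """Intermediate stage: transforms each sample as it passes through."""
    for sample in samples:
        yield [x * factor for x in sample]


def running_mean(samples: Iterable[Sample]) -> Tuple[int, Sample]:
    """Final stage: consumes the stream, keeping only a count and the current mean."""
    count = 0
    mean: Sample = []
    for sample in samples:
        count += 1
        if not mean:
            mean = [0.0] * len(sample)
        # Incremental mean update: no earlier samples need to be stored.
        mean = [m + (x - m) / count for m, x in zip(mean, sample)]
    return count, mean


if __name__ == "__main__":
    # Stages are chained lazily; nothing is computed until the last stage iterates.
    count, mean = running_mean(scale(source(1_000_000), 2.0))
    print(count, mean)
```

Because only the current sample and a small running summary are held at any moment, the same pipeline would work on input far larger than memory, which is the property the abstract attributes to data streaming.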