The International Journal of Innovative Research in Computer and Communication Engineering
Data mining can extract important knowledge from large data collections, but these collections are sometimes split among several parties. This paper addresses fast distributed mining of association rules over horizontally distributed data. Preparing a data set for analysis is generally the most time-consuming task in a data mining project, requiring numerous complex SQL queries that join tables and aggregate columns. Existing SQL aggregations are of limited use for preparing data sets because they return one column per aggregated group.
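
The limitation of standard SQL aggregations can be illustrated with a minimal sketch. The table `sales`, its columns, and the toy values are hypothetical and chosen only for illustration, and the sketch assumes the analysis wants one row per store with one column per weekday. A standard `GROUP BY` query returns one row per group with a single aggregated column, so the tabular layout has to be assembled in a separate step.

```python
import sqlite3

# Hypothetical toy table (not from the paper): one row per sale.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, weekday TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "Mon", 10.0),
        ("A", "Mon", 5.0),
        ("A", "Tue", 7.0),
        ("B", "Mon", 3.0),
        ("B", "Tue", 4.0),
    ],
)

# Standard aggregation: one row per (store, weekday) group,
# each carrying a single aggregated column.
vertical = conn.execute(
    "SELECT store, weekday, SUM(amount) FROM sales "
    "GROUP BY store, weekday ORDER BY store, weekday"
).fetchall()

# Assumed target layout: one row per store, one column per weekday.
# The reshaping is done by hand here because the query alone does not produce it.
weekdays = sorted({weekday for _, weekday, _ in vertical})
by_store = {}
for store, weekday, total in vertical:
    by_store.setdefault(store, dict.fromkeys(weekdays, 0.0))[weekday] = total
horizontal = [
    (store, *(cols[w] for w in weekdays)) for store, cols in sorted(by_store.items())
]
```

Here `vertical` holds four rows, one per group, while `horizontal` holds two rows, one per store, with the weekday totals spread across columns.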