Clustering Items in Different Data Sources Induced by Stability

Date Added: Oct 2009
Format: PDF

Many multi-branch companies transact from different branches, and each branch maintains its own database over time. How the sales of an item vary over time is an important question for such a company, so the authors introduce the notion of the stability of an item. Stable items are useful in making many strategic decisions. Based on each item's degree of stability, the authors design an algorithm for clustering items in different data sources. They propose the notion of a best cluster, defined by considering the average degree of variation of a class, and design an alternative algorithm that finds the best cluster among items in different data sources. Experimental results are provided on three transactional databases.
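
The abstract does not give the authors' definitions, so the following Python sketch is only an illustration of the general idea, not their algorithm. It assumes that each item's sales have already been combined across the branch databases into one series per item, that variation is measured as the mean absolute change between consecutive periods, that items are grouped when their variation values lie within a chosen gap of each other, and that the best cluster is the one with the lowest average variation. The function names, the gap parameter, and the sales figures are all hypothetical.

```python
from statistics import mean


def variation(series):
    """Mean absolute change between consecutive periods (an assumed measure)."""
    if len(series) < 2:
        return 0.0
    return mean(abs(b - a) for a, b in zip(series, series[1:]))


def cluster_by_variation(sales, gap):
    """Group items whose variation values are close to each other.

    Items are sorted by variation; a new cluster starts whenever the
    difference from the previous item's variation exceeds `gap`.
    """
    ranked = sorted(sales, key=lambda item: variation(sales[item]))
    clusters = []
    for item in ranked:
        if clusters:
            previous = clusters[-1][-1]
            if variation(sales[item]) - variation(sales[previous]) <= gap:
                clusters[-1].append(item)
                continue
        clusters.append([item])
    return clusters


def best_cluster(sales, clusters):
    """Pick the cluster with the lowest average variation (an assumed criterion)."""
    return min(clusters, key=lambda c: mean(variation(sales[i]) for i in c))


# Made-up sales per period, already summed over all branches.
sales = {
    "bread": [100, 102, 101, 103],
    "milk": [200, 201, 199, 200],
    "umbrella": [10, 80, 5, 60],
    "sunscreen": [5, 70, 10, 90],
}

clusters = cluster_by_variation(sales, gap=5.0)
best = best_cluster(sales, clusters)
print(clusters)
print(best)
```

With this toy data the two steady sellers fall into one cluster and the two seasonal items into another, and the steady cluster is selected as the best one. A different variation measure or gap value would change the grouping.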