International Journal of Computer Technology and Electronics Engineering
Identifying useful clusters in large datasets has attracted considerable interest in clustering research. Data on the World Wide Web (WWW) is increasing exponentially, which affects clustering accuracy and decision making; as a result, the concept underlying the clusters changes over time, a phenomenon known as concept drift. To detect differences in cluster distributions between the current data subset and the previous clustering result, an algorithm called Drifting Concept Detection (DCD) is applied together with proper data labeling. For the data labeling to be considered successful, the generated clusters must be efficient.
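
The idea of comparing cluster distributions between the previous clustering result and the current data subset can be illustrated with a minimal sketch. This is not the DCD algorithm itself, whose details are not given here. The function names, the use of total variation distance, and the threshold value of 0.2 are all assumptions made for illustration only.

```python
from collections import Counter


def cluster_proportions(labels, clusters):
    """Return the fraction of points assigned to each cluster."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {c: counts.get(c, 0) / total for c in clusters}


def drift_detected(prev_labels, curr_labels, threshold=0.2):
    """Flag drift when the cluster distributions differ by more than threshold.

    Hypothetical criterion: total variation distance between the
    cluster-size proportions of the previous and current labelings.
    """
    clusters = set(prev_labels) | set(curr_labels)
    prev = cluster_proportions(prev_labels, clusters)
    curr = cluster_proportions(curr_labels, clusters)
    distance = 0.5 * sum(abs(prev[c] - curr[c]) for c in clusters)
    return distance > threshold
```

Under these assumptions, a current subset whose points fall into the existing clusters in roughly the same proportions as before would not be flagged, while a subset dominated by one cluster or by a previously unseen cluster would be.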