The International Journal of Innovative Research in Computer and Communication Engineering
The first phase applies a genetic algorithm (GA) to the global clustering process, solving the optimization problem for both clustering and feature selection. The second phase uses a confusion matrix for derivative works and includes an improved GA for the final classification. The third phase presents an optimization technique that evaluates cluster optimality for efficient document clustering based on the optimized conceptual feature words. The final phase introduces a joint approach to clustering web pages, which first finds the recurrent sets and then clusters the documents.
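As an illustration of the final phase only, the following minimal sketch first finds term sets that recur across documents and then groups documents by the recurrent set they contain. It is not the method evaluated in this work: the function names `frequent_term_sets` and `cluster_by_term_sets`, the `min_support` and `max_size` parameters, and the tie-breaking rule (largest set first, then highest support) are all assumptions made for the example, and documents are assumed to be already tokenized into lists of terms.

```python
from collections import defaultdict
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Count term sets of up to max_size terms and keep those that
    occur in at least min_support documents (hypothetical helper)."""
    counts = defaultdict(int)
    for terms in docs:
        unique = sorted(set(terms))  # count each term once per document
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def cluster_by_term_sets(docs, min_support=2, max_size=2):
    """Assign each document to one recurrent term set it contains.
    Assumed rule: prefer the largest set, then the best supported one.
    Documents containing no recurrent set go to the empty-set cluster."""
    frequent = frequent_term_sets(docs, min_support, max_size)
    clusters = defaultdict(list)
    for idx, terms in enumerate(docs):
        present = set(terms)
        candidates = [s for s in frequent if s <= present]
        if not candidates:
            clusters[frozenset()].append(idx)
            continue
        best = max(candidates, key=lambda s: (len(s), frequent[s], sorted(s)))
        clusters[best].append(idx)
    return dict(clusters)
```

Enumerating every term combination is exponential in `max_size`, so a sketch like this is only practical for small vocabularies; a level-wise (Apriori-style) search would be the usual choice for larger collections.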