International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)
The explosive growth of information on the internet has attracted many users. Because that information is largely unstructured, users are unable to retrieve what they need efficiently. The authors concentrated on providing related pages that are of current interest to the user. To do this, they collected hyperlinks, transformed them into documents, and used numerical measures such as Euclidean distance and cosine similarity to measure how closely the websites are oriented to one another. They then applied a clustering algorithm to find out which sites are most closely associated with each other and are likely to form a cluster.
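The pipeline described above can be illustrated with a minimal sketch. The summary does not say how pages were turned into vectors or which clustering algorithm was used, so everything here is an assumption for illustration: the bag-of-words term counts, the sample page texts, the function names, the 0.5 threshold, and the simple single-link grouping that stands in for the unnamed clustering step.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (assumed representation, not from the paper)."""
    return Counter(text.lower().split())


def euclidean_distance(a, b):
    """Straight-line distance between two term-count vectors."""
    terms = set(a) | set(b)
    # Counter returns 0 for terms a document does not contain.
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in terms))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_similarity(vectors, threshold=0.5):
    """Single-link grouping: pages connected by a chain of pairs whose
    cosine similarity is at least `threshold` end up in one cluster.
    This is a stand-in for the clustering algorithm, which is not named."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent pointers to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_similarity(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(vectors)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical page texts standing in for documents built from hyperlinks.
pages = [
    "cricket match score live cricket",
    "live cricket score update",
    "python clustering tutorial",
    "clustering tutorial with python code",
]
vectors = [term_vector(p) for p in pages]
clusters = cluster_by_similarity(vectors, threshold=0.5)
print(clusters)  # [[0, 1], [2, 3]]
```

Cosine similarity ignores document length and compares only the direction of the term vectors, whereas Euclidean distance grows with differences in raw counts, which is one reason the two measures are often reported together.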