Science and Development Network (SciDev.Net)
Data clustering of RSS content is studied using a hierarchical clustering method. A word matrix is constructed from a randomly chosen list of RSS feeds. Each row of this matrix represents an RSS title and stores the words that occur in that feed according to a ratio. The Pearson correlation coefficient is used to compute the closeness between the contents of different RSS feeds, and the result is passed to a hierarchical clustering algorithm. A tree-like graph (dendrogram) is drawn to describe the hierarchical relationship.
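The pipeline described above can be illustrated with a minimal sketch in Python. Several details are assumptions, since the abstract does not specify them: the "ratio" is taken to be each word's relative frequency within a feed, the distance between two feeds is taken as 1 minus the Pearson correlation, and clusters are merged by average linkage. The sample feeds and all function names (`word_matrix`, `pearson_distance`, `hcluster`, `render`) are hypothetical and serve only as illustration.

```python
import math
import re
from collections import Counter


def word_matrix(feeds):
    """Build one row per feed title; each cell is a word's relative frequency.

    Assumption: the 'ratio' is count(word) / total words in that feed.
    """
    counts = {t: Counter(re.findall(r"[a-z]+", text.lower()))
              for t, text in feeds.items()}
    vocab = sorted(set().union(*counts.values()))
    titles = list(feeds)
    rows = []
    for t in titles:
        total = sum(counts[t].values()) or 1
        rows.append([counts[t][w] / total for w in vocab])
    return titles, vocab, rows


def pearson_distance(x, y):
    """Return 1 - r, so identical profiles give 0 and opposite ones give 2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # correlation undefined for a constant row; treat as unrelated
    return 1.0 - cov / (sx * sy)


def hcluster(rows):
    """Agglomerative clustering with average linkage (an assumed choice).

    Returns a nested tuple tree whose leaves are row indices.
    """
    clusters = [(i, [i]) for i in range(len(rows))]  # (tree, member indices)

    def linkage(a, b):
        total = sum(pearson_distance(rows[i], rows[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        p, q = min(pairs,
                   key=lambda pq: linkage(clusters[pq[0]][1], clusters[pq[1]][1]))
        merged = ((clusters[p][0], clusters[q][0]),
                  clusters[p][1] + clusters[q][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return clusters[0][0]


def render(tree, titles, depth=0):
    """Return the tree as indented text lines, a plain-text dendrogram."""
    if isinstance(tree, int):
        return ["  " * depth + titles[tree]]
    left, right = tree
    return (["  " * depth + "+"]
            + render(left, titles, depth + 1)
            + render(right, titles, depth + 1))


# Hypothetical sample data, not taken from the study.
feeds = {
    "Science A": "climate research climate science data",
    "Science B": "climate science research study",
    "Sports A": "football match goal team",
    "Sports B": "football team goal league",
}
titles, vocab, rows = word_matrix(feeds)
tree = hcluster(rows)
print("\n".join(render(tree, titles)))
```

With this toy input, feeds that share vocabulary end up under the same branch of the printed tree. Real feed text would normally need extra preprocessing, such as removing very common words, before the correlations become meaningful.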