Correlation Coefficient Based Average Textual Similarity Model for Information Retrieval System in Wide Area Networks
In wide area networks, retrieving relevant text is a challenging task for information retrieval systems because most information requests are text based. This paper addresses similarity measurement, performance evaluation, and the design of information retrieval techniques using four similarity functions: Jaccard, cosine, Dice, and overlap. These similarity functions are evaluated by measuring the similarity between the documents that the search engine retrieves for the entered text, using the vector space model.
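As an illustration of the four similarity functions named above, the following minimal sketch computes them on sparse term vectors in the vector space model. The abstract does not state the formulas or the term weighting, so the sketch assumes the common textbook vector forms with raw term-frequency weights; the helper names `tf_vector`, `dot`, and `similarities` and the whitespace tokenizer are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_vector(text):
    """Build a sparse term-frequency vector.

    Assumption: lowercase + whitespace tokenization; the paper's
    actual preprocessing and weighting are not given in the abstract.
    """
    return Counter(text.lower().split())


def dot(a, b):
    """Inner product of two sparse vectors stored as term -> weight."""
    return sum(weight * b[term] for term, weight in a.items() if term in b)


def similarities(a, b):
    """Return Jaccard, cosine, Dice, and overlap scores for vectors a, b.

    Assumed textbook vector forms:
      Jaccard = a.b / (|a|^2 + |b|^2 - a.b)
      cosine  = a.b / (|a| * |b|)
      Dice    = 2 * a.b / (|a|^2 + |b|^2)
      overlap = a.b / min(|a|^2, |b|^2)
    """
    ab, aa, bb = dot(a, b), dot(a, a), dot(b, b)
    if aa == 0 or bb == 0:
        # An empty vector shares nothing with any document.
        return {"jaccard": 0.0, "cosine": 0.0, "dice": 0.0, "overlap": 0.0}
    return {
        "jaccard": ab / (aa + bb - ab),
        "cosine": ab / math.sqrt(aa * bb),
        "dice": 2 * ab / (aa + bb),
        "overlap": ab / min(aa, bb),
    }


if __name__ == "__main__":
    query = tf_vector("wide area network retrieval")
    document = tf_vector("information retrieval in wide area network")
    for name, score in similarities(query, document).items():
        print(f"{name}: {score:.4f}")
```

With binary term weights these expressions reduce to the set-based forms, for example Jaccard becomes the size of the shared vocabulary divided by the size of the combined vocabulary. Note that the overlap score reaches 1 whenever every term of the shorter text appears in the longer one, which is why it tends to score higher than the other three on the same pair.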