Semantic Based Model for Text Document Clustering with Idioms

Provided by: Computer Science Journals
Topic: Data Management
Format: PDF
In this paper, the authors address the task of developing an effective and efficient clustering methodology that accounts for the semantic structure of text documents. The method performs the following sequence of operations: tagging the documents for parsing, replacing idioms with their original meaning, calculating semantic weights for document words, and applying a semantic grammar. A similarity measure is then computed between the documents, and the documents are clustered using a hierarchical clustering algorithm. The method is evaluated on different data sets with standard performance measures, and the results demonstrate its effectiveness in producing meaningful clusters.
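
The sequence of operations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the idiom dictionary `IDIOMS` is hypothetical, plain term frequencies stand in for the paper's semantic weights, cosine similarity and average linkage are assumed choices, and the tagging and semantic grammar steps are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical idiom dictionary (an assumption; the paper's resource is not given here).
IDIOMS = {
    "kick the bucket": "die",
    "piece of cake": "easy task",
    "under the weather": "ill",
}

def replace_idioms(text):
    """Lowercase the text and replace each known idiom with its literal meaning."""
    text = text.lower()
    for idiom, meaning in IDIOMS.items():
        text = text.replace(idiom, meaning)
    return text

def term_weights(text):
    """Plain term frequencies, used as a stand-in for semantic weights."""
    return Counter(re.findall(r"[a-z]+", text))

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def hierarchical_clusters(docs, k):
    """Agglomerative clustering with average linkage; stops at k clusters."""
    vectors = [term_weights(replace_idioms(d)) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        # Average pairwise similarity between the members of two clusters.
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the pair of clusters with the highest average similarity.
        _, i, j = max(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

docs = [
    "The exam was a piece of cake",
    "The exam was an easy task",
    "He felt under the weather today",
    "He felt ill today",
]
print(hierarchical_clusters(docs, 2))
```

In this sketch, replacing the idioms first is what lets the first two documents and the last two documents share vocabulary; without that step, each idiomatic sentence would overlap far less with its literal counterpart.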
