Date Added: Nov 2012
Keyphrases, also called keywords, are semantic metadata that play an important role in capturing the main theme of a large text collection. Authors of scientific publications typically supply a list of about five to ten keywords, which are used to map the papers to their respective domains. Non-scientific documents, however, are growing exponentially on the World Wide Web and in textual databases, so an automatic mechanism is needed to identify the keyphrases embedded within them. In this paper, the authors propose the design of a lightweight machine learning approach to identify feasible keyphrases in text documents. The proposed method mines various lexical and semantic features from texts to learn a classification model.
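The abstract does not list the features or name the classifier, so the following is only a minimal sketch of the general idea (extract candidate phrases, compute per-phrase features, learn a binary keyphrase/non-keyphrase model), not the authors' method. Everything in it is an assumption made for illustration: the stopword list, the three lexical features (relative frequency, earliness of first occurrence, phrase length), the choice of logistic regression, and all function names. Semantic features, which the paper also uses, are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "are",
             "for", "on", "from", "by", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidates(tokens, max_len=2):
    """Yield (phrase, start_index) for n-grams that neither start nor end with a stopword."""
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            yield " ".join(gram), i


def features(text, max_len=2):
    """Map each candidate phrase to [relative frequency, earliness, length in words]."""
    tokens = tokenize(text)
    counts = Counter()
    first = {}
    for phrase, i in candidates(tokens, max_len):
        counts[phrase] += 1
        first.setdefault(phrase, i)  # remember the first occurrence only
    total = max(len(tokens), 1)
    return {
        p: [counts[p] / total, 1.0 - first[p] / total, float(len(p.split()))]
        for p in counts
    }


def train(samples, labels, epochs=500, lr=0.5):
    """Fit logistic regression by stochastic gradient ascent; the last weight is the bias."""
    w = [0.0] * (len(samples[0]) + 1)
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = y - score(w, x)
            for j, xj in enumerate(x):
                w[j] += lr * err * xj
            w[-1] += lr * err
    return w


def score(w, x):
    """Return the model's estimated probability that feature vector x is a keyphrase."""
    z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))
```

In use, features() would be run over documents whose keyphrases are already known (for example, author-supplied keywords), each candidate labeled 1 or 0 accordingly, and the weights returned by train() would then score candidates from unlabeled documents. A lightweight linear model is chosen here only because it keeps the sketch short; the paper may use a different learner.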