Document Clustering Using Learning From Examples
Information Filtering (IF) systems usually filter data items by correlating a set of terms representing the user's interest with similar sets of terms representing the data items. Many techniques have been employed for constructing user profiles automatically, but they usually yield large sets of terms. Various dimensionality-reduction techniques can be applied to reduce the number of terms in a user query. This paper describes a new framework that classifies large-scale document collections and retrieves the documents related to a user's query using a trained Artificial Neural Network (ANN) model. Its novel feature is the identification of an optimal set of documents that are relevant to the user.
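
To make the term-correlation step concrete, the following is a minimal sketch of how a filter might score documents against a user profile. It is an illustration only, not the framework described here: it assumes raw term counts and cosine similarity as the correlation measure, and the names `term_vector`, `cosine_similarity`, `filter_documents`, and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def filter_documents(profile_terms, documents, threshold=0.2):
    """Return documents scoring at least `threshold`, best match first.

    The threshold value is an arbitrary assumption for illustration.
    """
    profile = term_vector(profile_terms)
    scored = [(cosine_similarity(profile, term_vector(d)), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored if score >= threshold]
```

For example, with the profile `"neural network training"`, a document such as `"training a neural network"` shares three terms with the profile and scores highly, while a document with no shared terms scores zero and is filtered out. Every distinct term adds a dimension to these vectors, which is why large profiles motivate dimensionality reduction.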