International Journal of Advanced Research in Computer Engineering & Technology
In many text mining applications, side-information is available along with the text documents. This side-information may be of different kinds, such as document provenance information, the links in the document, user-access behavior from web logs, or other non-textual attributes embedded in the text document. Such attributes may contain a large amount of information that is useful for clustering. However, it can be risky to incorporate side-information into the mining process, because it can either improve the quality of the representation used for mining or add noise to the process.