On the Utilization Aspect of Document Data for Mining the Side Information

Provided by: IRD India
Topic: Data Management
Format: PDF
In text mining applications, side-information is available along with the text documents. This side-information can include document provenance information, links within the document, web logs of user-access behavior, or other non-textual attributes embedded in the text document. Such attributes can contain a substantial amount of information for clustering purposes. However, it is usually difficult to estimate the importance of this side-information, especially when it is noisy. In such cases, incorporating it into the mining process is risky, since it can add noise rather than improve the quality of the mining process.
