A Data Summarisation Approach to Knowledge Discovery
Source: University of York
Knowledge discovery in structured and unstructured datasets held in large database repositories has long motivated methods for data summarisation. Summarisation is related to compression, machine learning, and data mining, and most closely to data mining. In the unstructured domain, data summarisation methods usually involve text categorisation, which groups together documents that share similar characteristics. As the number of text documents in large database systems keeps growing, algorithms for text summarisation in the unstructured domain, such as document clustering, are often limited by the dimensionality of the data features.
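To make the grouping of similar documents and the dimensionality problem concrete, the following is a minimal sketch, not the method proposed in this work. It assumes a plain bag-of-words representation, cosine similarity, and a greedy single-pass grouping rule; the function names, the 0.3 threshold, and the four sample documents are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def term_counts(text):
    """Lower-case the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_documents(docs, threshold=0.3):
    """Greedy grouping (an illustrative assumption): a document joins the
    first group whose first member is similar enough, else starts a new one."""
    vectors = [term_counts(d) for d in docs]
    groups = []  # each group is a list of document indices
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def vocabulary_size(docs):
    """Number of distinct terms, i.e. the dimensionality of the feature space."""
    return len(set().union(*(term_counts(d) for d in docs)))


# Hypothetical sample documents, invented for this sketch.
DOCS = [
    "data mining finds patterns in large databases",
    "data mining extracts patterns from databases",
    "document clustering groups similar text documents",
    "clustering groups text documents by similarity",
]

GROUPS = group_documents(DOCS)   # [[0, 1], [2, 3]]
VOCAB = vocabulary_size(DOCS)    # 17 distinct terms for only 4 documents
```

Even in this toy collection, four short documents already produce a 17-dimensional feature space, and every new document can add further terms. This growth in the number of features is the limitation on document clustering referred to above.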