Big Data

A Two-stage Information Filtering Based on Rough Decision Rule and Pattern Mining

Date Added: Nov 2010
Format: PDF

Information Overload and Mismatch are two fundamental problems affecting the effectiveness of information filtering systems. Even though both term-based and pattern-based approaches have been proposed to address the problems of overload and mismatch, neither of these approaches alone can provide a satisfactory solution to address these problems. This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model.