Intelligent Linux Information Access by Data Mining: the ILIAD Project

Date Added: Jun 2010
Format: PDF

The authors propose an alternative to conventional information retrieval over Linux forum data, based on thread-, post- and user-level analysis and interfaced with an information retrieval engine via reranking. Due to the sheer scale of web data, simple keyword matching is an effective means of information access for many informational web queries. There remain significant clusters of information access needs, however, where keyword matching is less successful. This paper outlines the ILIAD project, focusing on the tasks of crawling, thread-level analysis, post-level analysis, user-level analysis and IR reranking. The authors have designed a series of class sets for the component tasks and carried out experiments over a range of data sources, achieving encouraging results.
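
To make the reranking idea concrete, here is a minimal sketch of how thread-, post- and user-level signals might be combined with a retrieval engine's own score. It is not the authors' method: the `Hit` fields, the linear weighting and the default weights are all assumptions made for illustration, and the abstract does not say how ILIAD combines its signals.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One retrieved forum thread. All field names are hypothetical."""
    thread_id: str
    ir_score: float      # score from the underlying retrieval engine
    thread_score: float  # assumed thread-level signal, scaled to 0..1
    post_score: float    # assumed post-level signal, scaled to 0..1
    user_score: float    # assumed user-level signal, scaled to 0..1


def rerank(hits, weights=(0.5, 0.2, 0.2, 0.1)):
    """Reorder hits by a weighted sum of the four scores (an assumed scheme)."""
    w_ir, w_thread, w_post, w_user = weights

    def combined(hit):
        return (w_ir * hit.ir_score
                + w_thread * hit.thread_score
                + w_post * hit.post_score
                + w_user * hit.user_score)

    # Highest combined score first.
    return sorted(hits, key=combined, reverse=True)


# Made-up example: t1 matches the keywords best, t2 has stronger forum signals.
hits = [
    Hit("t1", ir_score=0.9, thread_score=0.1, post_score=0.2, user_score=0.3),
    Hit("t2", ir_score=0.7, thread_score=0.9, post_score=0.8, user_score=0.6),
]
reranked_ids = [hit.thread_id for hit in rerank(hits)]
```

With these made-up scores, the thread with the weaker keyword match moves to the top because its forum-level signals are stronger; passing weights of `(1, 0, 0, 0)` falls back to the engine's original order.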