Knowledge Discovery From Web Usage Data: Complete Preprocessing Methodology
Web log data is usually diverse and voluminous. It must be assembled into a consistent, integrated and comprehensive view before it can be used for pattern discovery. As in most data mining applications, data preprocessing involves removing and filtering redundant and irrelevant data, removing noise, transforming the data and resolving any inconsistencies. Data preprocessing therefore plays a fundamental role in applications of knowledge discovery from Web usage data (KDWUD). A significant problem with most pattern discovery methods is their difficulty in handling very large volumes of Web usage data (WUD). Although most KDWUD processes are done off-line, the size of WUD is orders of magnitude larger than that met in common applications of machine learning.
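The cleaning steps named above (dropping noise, irrelevant requests and redundant entries) can be illustrated with a minimal sketch. It assumes the log is in Common Log Format, that requests for images, stylesheets and scripts are irrelevant, and that only successful (2xx) requests are of interest. These choices, along with the names `clean_log`, `CLF_PATTERN` and `IRRELEVANT_SUFFIXES`, are illustrative assumptions and not part of the methodology described here.

```python
import re

# Assumed input format: Common Log Format. Group names are illustrative.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

# Assumption: requests for these resource types carry no usage information.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def clean_log(lines):
    """Return parsed entries with noise, irrelevant and duplicate lines removed."""
    seen = set()
    cleaned = []
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # noise: the line does not parse as a log entry
        entry = match.groupdict()
        if not entry["status"].startswith("2"):
            continue  # irrelevant: failed or redirected request
        path = entry["url"].lower().split("?", 1)[0]
        if path.endswith(IRRELEVANT_SUFFIXES):
            continue  # irrelevant: embedded resource, not a page view
        key = (entry["host"], entry["time"], entry["method"], entry["url"])
        if key in seen:
            continue  # redundant: exact repeat of an earlier entry
        seen.add(key)
        cleaned.append(entry)
    return cleaned


# Hypothetical sample lines, made up for illustration only.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /logo.gif HTTP/1.0" 200 512',
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2000:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 209',
    'garbled line',
]

if __name__ == "__main__":
    for entry in clean_log(SAMPLE):
        print(entry["host"], entry["url"])
```

Because the function makes a single pass and keeps only a set of seen keys in memory, it can be fed a file object line by line, which matters given the data volumes discussed above.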