An Unsupervised Model to detect Web Spam Based on Qualified Link Analysis and Language Models

Provided by: Anna University
Topic: Security
Format: PDF
With the massive use of the Internet and search engines, Web spam has become a major problem. Web spam can be detected by analyzing various features of web pages and classifying each page as spam or non-spam. The proposed work uses unsupervised learning algorithms to characterize web pages by their link-based and content-based features, comparing the sources of information in the source page of a link with those in its target page. The first unsupervised technique considered is the Hidden Markov Model, which captures the different browsing patterns of users.
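
The comparison between a link's source and target pages can be illustrated with a small sketch. The abstract does not say which measure the authors use, so the following assumes smoothed unigram language models compared by Kullback-Leibler divergence, a common choice for this kind of comparison. The function names, the whitespace tokenization, and the smoothing constant `alpha` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def unigram_model(text, vocab, alpha=1.0):
    """Build an additively smoothed unigram model over a fixed vocabulary.

    alpha is an assumed smoothing constant; it keeps every probability
    strictly positive so the divergence below is always defined.
    """
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for models on the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def source_target_divergence(source_text, target_text, alpha=1.0):
    """Divergence between text near a link (source) and the linked page (target)."""
    vocab = set(source_text.lower().split()) | set(target_text.lower().split())
    p = unigram_model(source_text, vocab, alpha)
    q = unigram_model(target_text, vocab, alpha)
    return kl_divergence(p, q)


# Illustrative inputs only, not data from the paper.
anchor = "cheap flights to paris"
related_page = "book cheap flights to paris today"
unrelated_page = "win casino bonus poker now"

related_score = source_target_divergence(anchor, related_page)
unrelated_score = source_target_divergence(anchor, unrelated_page)
```

Under these assumptions, a link whose surrounding text shares little vocabulary with the page it points to receives a higher divergence, which could serve as one content-based feature alongside the link-based ones.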
