Identifying Spam Web Pages Based on Content Similarity

Source: Springer Science+Business Media

The Web provides its users with abundant information. Unfortunately, when a Web search is performed, both users and search engines face an annoying problem: the presence of misleading Web pages, i.e., spam Web pages, that are ranked among legitimate Web pages. The mixed results degrade the performance of search engines and frustrate users, who must filter out useless information. To improve the quality of Web searches, the number of spam pages on the Web must be reduced, even if they cannot be eradicated entirely.
Format: PDF Size: 989.90
Date: May 2008