Web Spam Detection: Link-Based and Content-Based Techniques
The Web is both an excellent medium for sharing information and an attractive platform for delivering products and services. This platform is, to some extent, mediated by search engines in order to meet the needs of users seeking information. Search engines are the "Dragons" that keep a valuable treasure: information. Given the vast amount of information available on the Web, it is customary to answer queries with only a small set of results (typically 10 or 20 pages at most). Search engines must therefore rank Web pages in order to create a short list of high-quality results for users. Web spam can significantly deteriorate the quality of search engine results. This paper presents the main techniques recently introduced for Web spam detection and demotion.