Nullification Test Collections for Web Spam and SEO
Research in adversarial information retrieval has been facilitated by the availability of the UK-2006 and UK-2007 collections, which comprise crawl data, link graphs, and spam labels. However, these resources do not support well research into nullifying the negative effect of spam or excessive Search Engine Optimisation (SEO) on the ranking of non-spam pages, nor do they support the study of cloaking techniques or click spam. Finally, because a .uk crawl is restricted to a single domain, only parts of link-farm icebergs may be visible in these crawls. The paper introduces the term nullification, which it defines as "Preventing problem pages from negatively affecting search results".