A Study of Link Farm Distribution and Evolution Using a Time Series of Web Snapshots
In this paper, the authors study the overall structure of link-based spam and its evolution, which should help in developing robust analysis tools and in researching Web spamming as a social activity in cyberspace. First, the authors use strongly connected component (SCC) decomposition to separate many link farms from the largest SCC, the so-called core. They show that denser link farms in the core can be extracted by node filtering and recursive application of SCC decomposition to the core. Surprisingly, new large link farms are found in each iteration, and this trend continues for at least 10 iterations. In addition, they measure the spamicity of these link farms. Next, they examine the evolution of link farms over two years.
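The recursive procedure summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the node-filtering criterion, so the sketch assumes a simple rule that drops core nodes whose in-degree or out-degree inside the core is below a threshold. The function names, the `min_degree` parameter, and the choice of Kosaraju's algorithm for the decomposition are all hypothetical choices made for illustration.

```python
def strongly_connected_components(nodes, edges):
    """Return the SCCs of a directed graph as a list of node sets (Kosaraju)."""
    nodes = set(nodes)
    fwd = {n: [] for n in nodes}
    rev = {n: [] for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            fwd[u].append(v)
            rev[v].append(u)

    # Pass 1: iterative DFS on the forward graph, recording finish order.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                # All successors explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing finish time.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        component, stack = {start}, [start]
        assigned.add(start)
        while stack:
            node = stack.pop()
            for nxt in rev[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    component.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def peel_link_farm_candidates(nodes, edges, min_degree=2, max_iterations=10):
    """Repeat: decompose into SCCs, set aside non-core SCCs, filter the core.

    Returns (candidates_per_iteration, final_core). The degree-based filter
    is an assumed stand-in for the paper's node filtering step.
    """
    nodes, edges = set(nodes), list(edges)
    candidates_per_iteration, core = [], set()
    for _ in range(max_iterations):
        components = strongly_connected_components(nodes, edges)
        if not components:
            break
        core = max(components, key=len)
        # Non-trivial SCCs outside the core are the link farm candidates.
        candidates_per_iteration.append(
            [c for c in components if c is not core and len(c) > 1]
        )
        # Assumed filter: keep core nodes with enough in- and out-links
        # to other core nodes.
        core_edges = [(u, v) for u, v in edges if u in core and v in core]
        indeg = {n: 0 for n in core}
        outdeg = {n: 0 for n in core}
        for u, v in core_edges:
            outdeg[u] += 1
            indeg[v] += 1
        kept = {n for n in core
                if indeg[n] >= min_degree and outdeg[n] >= min_degree}
        if not kept or kept == core:
            break  # Filtering removes everything or nothing: stop.
        nodes = kept
        edges = [(u, v) for u, v in core_edges if u in kept and v in kept]
    return candidates_per_iteration, core
```

Calling `peel_link_farm_candidates(nodes, edges)` on a host-level link graph returns, for each iteration, the non-core SCCs split off at that step, together with the core that remains at the end. Measuring the spamicity of those components is a separate step that this sketch does not cover.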