Characterizing Comment Spam in the Blogosphere Through Content Analysis

Executive Summary

Spam is no longer limited to email and web pages. Its growing presence as comments on blogs and social networks has become a nuisance and a potential threat. In this work, the author explores the challenges posed by this type of spam in the blogosphere, with substantial generalization to other social media. The paper investigates the characteristics of comment spam in blogs based on its content. The framework uses some previously explored methods to extract the features of blog spam effectively, and it introduces a novel method of active learning from raw data that does not require training instances. This makes the approach more flexible and realistic for such applications.

  • Format: PDF
  • Size: 417.9 KB