Extracting Spam Blogs With Co-Citation Clusters

Source: Association for Computing Machinery

This paper estimates the number of spam blogs in order to assess their current state in the blogosphere. To extract spam blogs, the authors developed a method that traverses co-citation clusters of blogs, starting from a spam seed. Spam seeds were collected using two criteria: high out-degree and spam keywords. In the experiment, a mixed seed set composed of high out-degree and spam keyword seeds was more effective than either individual seed set in terms of F-measure. The authors conclude that mixing seeds from different methods improves the F-measure of spam extraction with co-citation clusters.
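The abstract does not spell out how clusters are built or traversed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that two blogs are co-cited when at least one page links to both, and that traversal simply follows co-citation links outward from the seeds. All function names, blog names, and data are hypothetical toy values invented for illustration; none of them come from the paper.

```python
from collections import defaultdict, deque
from itertools import combinations


def cocitation_graph(citations, min_shared=1):
    """Link two blogs when at least `min_shared` sources cite both.

    `citations` maps each citing page to the set of blogs it links to.
    The threshold is an assumption of this sketch.
    """
    shared = defaultdict(int)
    for targets in citations.values():
        for a, b in combinations(sorted(targets), 2):
            shared[(a, b)] += 1
    graph = defaultdict(set)
    for (a, b), count in shared.items():
        if count >= min_shared:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def traverse_from_seeds(graph, seeds):
    """Breadth-first traversal over co-citation links, starting at the seeds."""
    seen = set(seeds)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def f_measure(extracted, actual):
    """Harmonic mean of precision and recall (F1)."""
    true_positives = len(extracted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented): three citing pages and the blogs they link to.
citations = {
    "hub1": {"spamA", "spamB"},
    "hub2": {"spamC", "spamD"},
    "hub3": {"blogX", "blogY"},
}
actual_spam = {"spamA", "spamB", "spamC", "spamD"}

# Hypothetical seeds, standing in for the two collection criteria.
out_degree_seeds = {"spamA"}
keyword_seeds = {"spamC"}

graph = cocitation_graph(citations)
from_out_degree = traverse_from_seeds(graph, out_degree_seeds)
from_mixed = traverse_from_seeds(graph, out_degree_seeds | keyword_seeds)

print(f_measure(from_out_degree, actual_spam))
print(f_measure(from_mixed, actual_spam))
```

In this toy setup each seed type reaches a different cluster, so combining them raises recall without lowering precision. Real data would not be this clean, and the sketch says nothing about the magnitude of the effect the paper reports.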
Format: PDF
Size: 234.60
Date: Apr 2008