Exploring Linguistic Features for Web Spam Detection: A Preliminary Study
Source: Association for Computing Machinery
This paper studies the usability of linguistic features in the Web spam classification task. The features were computed on two Web spam corpora, Webspam-Uk2006 and Webspam-Uk2007, and are made publicly available to other researchers. Preliminary analysis suggests that certain linguistic features may be useful for spam detection when combined with features studied elsewhere.
Size: 238.10
Date: Apr 2008



