Using Semantic Analysis to Classify Search Engine Spam
Source: Stanford University
Because spam and non-spam documents are so similar, the authors' original semantic analyzers are not an effective method for classifying spam content. The two are sometimes very difficult even for a human to tell apart, so it is unlikely that any natural language analysis method will succeed in differentiating between them. However, using semantic analyzers to determine the usefulness of the information on a webpage produced much more promising results. Assuming the user is mainly interested in finding a quick answer to their query, a page with more textual information should receive a higher rank, and the analyzers could help determine this rank.
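
The ranking idea in the last two sentences can be illustrated with a minimal sketch. It is not the analyzers described in the source: it assumes that "textual information" can be approximated by the number of distinct non-stopword terms in a page's visible text. The stopword list and the names strip_tags, text_information_score, and rank_pages are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def strip_tags(html: str) -> str:
    """Crudely remove markup so that only the visible text remains."""
    return re.sub(r"<[^>]+>", " ", html)


def text_information_score(html: str) -> int:
    """Count distinct non-stopword terms as a rough proxy for textual information."""
    words = re.findall(r"[a-z]+", strip_tags(html).lower())
    return len({w for w in words if w not in STOPWORDS})


def rank_pages(pages: dict[str, str]) -> list[str]:
    """Return page names ordered from most to least textual information."""
    return sorted(
        pages,
        key=lambda name: text_information_score(pages[name]),
        reverse=True,
    )


if __name__ == "__main__":
    pages = {
        "thin": "<a href='x'>buy</a> <a href='y'>buy</a>",
        "rich": "<p>Semantic analysis assigns meaning to words in a document.</p>",
    }
    for name in rank_pages(pages):
        print(name, text_information_score(pages[name]))
```

Counting distinct terms rather than raw words keeps a page that repeats one keyword many times from scoring well. A real semantic analyzer would presumably weigh meaning, not just vocabulary size.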