Levels of Awareness: Design Considerations for Web Crawlers and Censorware Detection
Search engines are tremendous force-multipliers for end-hosts trying to discover content on the Web. As the amount of content online grows, so does dependence on web crawlers to discover it. Adversaries' desire to keep search engines from detecting content on the Internet has scaled accordingly. Web crawlers, programs that algorithmically traverse links to explore the Web, fall victim to the same censorship infrastructure (censorware) tricks and attacks that prevent typical end-hosts from accessing "controversial" content.