Cloudflare Flags Perplexity for Internet Crawling Violations

Cloudflare Accuses AI Startup of ‘Stealth Crawling Behavior’ Across Millions of Sites

Cloudflare is accusing Perplexity of using stealth crawlers to bypass site restrictions, triggering fresh concerns over how AI firms access web content.

Written By
Liz Ticong
Aug 5, 2025

Cloudflare is accusing AI startup Perplexity of using stealth crawlers to bypass website restrictions and access content in violation of internet crawling norms.

In a recent blog post, Cloudflare alleges that Perplexity deployed undeclared bots designed to slip past traditional defenses and avoid detection. Cloudflare says the activity spans millions of automated requests and has prompted it to update its countermeasures.

Unauthorized access to private domains

According to Cloudflare, it began investigating Perplexity after receiving reports from website operators who had blocked the company’s official crawlers but continued seeing their content appear in Perplexity’s results. To test the claims, Cloudflare created newly registered, undiscoverable domains and configured them to deny access to all bots.

Despite those protections, Cloudflare says Perplexity was still able to retrieve and surface specific content from the restricted test sites. The company alleges that Perplexity bypassed both robots.txt directives and web application firewall (WAF) rules in doing so.
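For readers unfamiliar with robots.txt, the following is a minimal Python sketch of how a well-behaved crawler would read such a policy. It is an illustration only, not Cloudflare's test setup: the bot name ExampleBot, the domain example.test, and both policy texts are hypothetical. It uses the standard library's urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: deny every crawler, as the test domains reportedly did.
DENY_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical policy: block one named bot, allow everyone else.
DENY_ONE = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""

URL = "https://example.test/article"  # hypothetical test page


def is_allowed(user_agent: str, url: str, robots_txt: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for name, policy in (("deny-all", DENY_ALL), ("deny-one", DENY_ONE)):
        for agent in ("ExampleBot/1.0", "Mozilla/5.0 (Macintosh)"):
            print(name, agent, is_allowed(agent, URL, policy))
```

Under the deny-all policy, no user agent is permitted. Under the name-specific policy, a request that presents a browser-style user agent is not matched by the ExampleBot rule. In both cases robots.txt is advisory: it only works if the crawler chooses to check it and obey the answer.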

Bots disguised as browsers

Cloudflare says the content was accessed using undisclosed bots that didn’t identify themselves as belonging to Perplexity. These crawlers reportedly posed as ordinary browsers by mimicking common user agents such as Chrome on macOS.

The traffic also came from IP addresses outside of Perplexity’s documented range. Cloudflare states the bots rotated through different IPs and even changed Autonomous System Numbers (ASNs) to avoid detection and blocking.

Cloudflare attributes millions of these stealth requests to Perplexity each day, spread across tens of thousands of domains. The company claims it was able to fingerprint the activity using network signals and machine learning.
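To illustrate why undeclared traffic is hard to attribute, here is a minimal Python sketch of the simplest check a site operator might run: compare a request's user agent and source IP against a crawler's published details. It is not Cloudflare's detection method. The bot name ExampleBot and the IP ranges (reserved documentation ranges) are hypothetical placeholders, not Perplexity's real values.

```python
import ipaddress

# Hypothetical declared crawler token and published ranges (assumptions).
DECLARED_AGENT_TOKEN = "ExampleBot"
PUBLISHED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def classify(user_agent: str, source_ip: str) -> str:
    """Label a request by whether it matches the declared crawler's details."""
    addr = ipaddress.ip_address(source_ip)
    in_range = any(addr in net for net in PUBLISHED_RANGES)
    claims_bot = DECLARED_AGENT_TOKEN.lower() in user_agent.lower()

    if claims_bot and in_range:
        return "declared"
    if claims_bot:
        # Uses the bot's name but comes from an unpublished address.
        return "claimed-but-unverified"
    if in_range:
        return "undeclared-agent-from-published-range"
    # Browser-like user agent from an unknown address: nothing to match on.
    return "unattributed"


if __name__ == "__main__":
    chrome = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
              "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36")
    print(classify("ExampleBot/1.0", "192.0.2.10"))
    print(classify("ExampleBot/1.0", "203.0.113.5"))
    print(classify(chrome, "203.0.113.5"))
```

A request with a browser user agent from an address outside the published ranges comes back as unattributed, since a check like this has nothing to match on. That gap is why Cloudflare says it relied on network signals and machine learning to link the traffic to Perplexity.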

Perplexity’s web crawlers

Perplexity states that it uses two bots: one for search indexing and another to fetch content in response to user questions. According to the company, both operate under declared user agents, connect from published IP ranges, and are not used to train AI models.

These crawlers are documented on Perplexity’s website, but Cloudflare’s allegations center on traffic coming from undeclared sources, outside the scope of what Perplexity publicly describes.

Concerns about how Perplexity accesses content aren’t new. In 2024, multiple reports claimed the company was scraping websites that had blocked bots, relying on unlisted IPs and external crawling tools. Amazon later confirmed it was reviewing whether this breached the AWS terms of service.

More recently, the BBC sent a legal letter accusing Perplexity of reproducing its content without permission and bypassing robots.txt restrictions it had placed on the company’s declared bots.

Just a sales pitch?

Perplexity disputed Cloudflare’s allegations in an email to TechCrunch. Spokesperson Jesse Dwyer called Cloudflare’s blog post a “sales pitch” and said the screenshots cited showed no content was accessed. He added that the bot named in the report is not operated by the company.

Liz Ticong

Liz Ticong is a staff writer for eWeek and TechRepublic focused on AI, cybersecurity, enterprise software, and data. She has more than 10 years of editorial experience as a technology industry writer, combining reporting, product research, and hands-on software testing in her coverage. Her work has been published on Datamation, Enterprise Networking Planet, and TechnologyAdvice.com. She writes technology news, software reviews, product comparisons, and buyer’s guides for business and IT readers.