Perplexity is accused of using stealth bots to get around websites' no-crawl directives, even where site owners had blocked its crawlers through robots.txt files and firewall rules. Cloudflare reported that it received complaints from customers that Perplexity was still scraping their content. Its testing found that Perplexity used undeclared crawlers that masked their activity by rotating through multiple IP addresses not officially registered to Perplexity and by accessing sites from different networks to evade blocks. The alleged behavior contradicts Internet norms that have been in place since the Robots Exclusion Protocol was created in 1994.
"This undeclared crawler utilized multiple IPs not listed in Perplexity's official IP range, and would rotate through these IPs in response to the restrictive robots.txt policy and block from Cloudflare."
"In addition to rotating IPs, we observed requests coming from different ASNs in attempts to further evade website blocks. This activity was observed across tens of thousands of domains and millions of requests per day."
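For context, the Robots Exclusion Protocol depends on voluntary compliance: a well-behaved crawler reads a site's robots.txt and checks its own declared user agent against the rules before fetching anything. The following is a minimal sketch of that check using Python's standard library. The robots.txt content, the bot name "ExampleBot", and the URL are hypothetical and are not taken from Cloudflare's report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish.
# "ExampleBot" is an illustrative crawler name, not a real one.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text instead of downloading them.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    print("ExampleBot allowed:", may_fetch("ExampleBot", page))
    print("OtherBot allowed:", may_fetch("OtherBot", page))
```

Because the rules are matched against the user agent a crawler declares, a crawler that does not identify itself as the blocked bot is not covered by that bot's rule.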