Cloudflare has accused the AI startup Perplexity of ignoring websites' 'no crawl' directives and evading blocks by disguising its web crawlers as ordinary web browsers. Despite explicit prohibitions in robots.txt files and Web Application Firewall rules, Perplexity allegedly used multiple unlisted IP addresses to scrape content from sites. The issue came to light after complaints from Cloudflare customers who had already taken measures to block Perplexity's crawling, and it prompted Cloudflare to develop new services aimed at preventing aggressive AI crawlers from misusing site content.
Cloudflare accused Perplexity of circumventing 'no crawl' directives by disguising its web crawler as a standard Chrome browser to access blocked content.
Cloudflare's investigation found that Perplexity was still able to access content even where sites had explicitly blocked it in robots.txt files and Web Application Firewall rules.
Cloudflare customers reported that Perplexity was still accessing and using their content after they had disallowed its crawlers, which prompted the investigation.
Cloudflare introduced new services to prevent aggressive AI crawlers from bypassing site restrictions, addressing the issue of unauthorized content scraping.
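To illustrate why a disguised user agent matters, here is a minimal sketch of how a robots.txt rule is evaluated, using Python's standard `urllib.robotparser`. The crawler token `PerplexityBot`, the `example.com` URL, the browser user-agent string, and the helper `is_allowed` are assumptions chosen for illustration, not details taken from Cloudflare's report. The sketch shows only that robots.txt rules match on the user agent a client declares, so a rule naming one crawler does not apply to a request that identifies itself as a browser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows one named crawler everywhere.
# "PerplexityBot" is an assumed token used for illustration only.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"  # placeholder URL
# Illustrative browser-style user agent string (assumed, not from the report).
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0.0.0 Safari/537.36"

print(is_allowed("PerplexityBot", url))  # the declared crawler is disallowed
print(is_allowed(browser_ua, url))       # no rule names a browser-style agent
```

Because robots.txt is advisory and relies on honest self-identification, compliance depends on the crawler checking the rules under its real name. Enforcement against a client that misreports its identity has to happen elsewhere, for example in firewall rules.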