Anthropic and OpenAI are crawling the web even more and not giving much back
Briefly

"This is one of the most under-discussed parts of the AI revolution. While tech companies spend lavishly on data centers, GPUs, and talent, they avoid talking about the other key ingredient of AI success: data. That's because they don't want to pay for the high-quality human data that's needed for AI model training, inference, and AI outputs. Instead, they send out bots to crawl websites and scoop up this information, mostly for free."
"This formed the grand bargain of the web. Sites would let their data be taken for free on the understanding that they would get referrals in return, and could pay for their efforts through advertising, subscriptions, and other techniques. In the new generative AI world, this deal is breaking down. Now, AI answer engines and chatbots give users direct answers, making people less likely to visit the websites that created and verified the data in the first place."
Cloudflare measured how often Big Tech bots crawl websites and how much referral traffic those platforms send back, drawing on data covering about 20% of the world's websites. The crawl-to-refer ratio compares bot crawls to referrals and shows how much value a platform returns to publishers. Recent Cloudflare data shows the ratio has worsened since early September, with Anthropic and OpenAI crawling heavily while sending very few referrals. Generative AI answers reduce visits to source sites, undermining the earlier bargain in which publishers allowed free use of their content in exchange for referral traffic and the chance to monetize it.
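As a rough illustration of the arithmetic behind a crawl-to-refer ratio, the sketch below assumes the ratio is simply crawl requests divided by referrals; the summary above does not spell out Cloudflare's exact formula. The function name, the platform labels, and the counts are hypothetical and are not Cloudflare measurements.

```python
def crawl_to_refer_ratio(crawls: int, referrals: int) -> float:
    """Crawl requests per referral (assumed definition).

    A higher value means the platform crawls a lot relative to the
    traffic it sends back to publishers.
    """
    if crawls < 0 or referrals < 0:
        raise ValueError("counts must be non-negative")
    if referrals == 0:
        # No referrals at all: the ratio is unbounded if any crawling happened.
        return float("inf") if crawls > 0 else 0.0
    return crawls / referrals


# Made-up counts for illustration only: (crawl requests, referrals).
example_counts = {
    "platform_a": (50_000, 10),
    "platform_b": (1_200, 100),
}

ratios = {
    name: crawl_to_refer_ratio(crawls, referrals)
    for name, (crawls, referrals) in example_counts.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.0f} crawls per referral")
```

Under this assumed definition, a platform whose ratio rises over time is taking more from sites per visitor it sends back, which is the sense in which the ratio "worsens" for publishers.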
Read at Business Insider