The web has a new system for making AI companies pay up
Really Simple Licensing (RSL) lets web publishers specify licensing and royalty terms in robots.txt and in other content, so they can require payment when AI companies scrape that content for training data.
Amazon Gets Scraped, Too; LinkedIn Loves Video | AdExchanger
AI companies are crawling Amazon for shopping data, LinkedIn is expanding invite-only video revenue-sharing, and platform competition is generating disputes between Google and Fox.
Asahi, Nikkei sue Perplexity AI for copyright infringement
Perplexity faces copyright lawsuits from Japan's Nikkei and Asahi, which allege unlawful scraping and robots.txt violations and seek injunctions plus ¥2.2 billion in damages per firm.
Good web crawlers support HTTP/2, identify themselves via the User-Agent header, respect robots.txt, honor caching headers and follow redirects, back off on slow servers, and expose crawl details.
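
A few of the behaviors listed above (identifying via user-agent, respecting robots.txt, backing off on slow servers) can be sketched with Python's standard library. This is a minimal illustration, not a full crawler: the bot name, the info URL, the sample robots.txt, and the "wait twice the last response time" rule are all assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler identity; a real bot would publish its own name and info URL.
USER_AGENT = "ExampleBot/1.0 (+https://example.com/bot-info)"


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str) -> bool:
    """Return True if robots.txt permits this user-agent to fetch the URL."""
    return parser.can_fetch(USER_AGENT, url)


def next_delay(parser: RobotFileParser, last_response_seconds: float,
               base_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """Pick a wait time before the next request to the same host.

    Assumed policy: never go below the site's Crawl-delay (or base_delay
    if none is declared), wait longer when the server responds slowly,
    and cap the wait at max_delay.
    """
    declared = parser.crawl_delay(USER_AGENT)
    floor = float(declared) if declared is not None else base_delay
    return min(max(floor, 2.0 * last_response_seconds), max_delay)


# Sample robots.txt, invented for this example.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

rules = load_rules(SAMPLE_ROBOTS)
print(is_allowed(rules, "https://example.com/news"))       # permitted path
print(is_allowed(rules, "https://example.com/private/x"))  # disallowed path
print(next_delay(rules, last_response_seconds=10.0))       # slow server, longer wait
```

In a real crawler the same USER_AGENT string would also be sent in the User-Agent header of every request, so site operators can match the traffic to the published bot details.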