Sour Scrapes; (Anti)-trust The Process | AdExchanger
Briefly

"Reddit has cultivated a lucrative business licensing its data to LLMs, so it's heavily incentivized to crack down on companies selling data without a licensing agreement. What makes this case intriguing is how the data was gathered - not directly from Reddit but indirectly through Google's crawlers via third-party scraping vendors. In other words, the information is technically public, but the path to obtaining it is anything but straightforward."
"The scrapers included SerpApi, Oxylabs and AWMProxy, three of the four companies named in the lawsuit, which then combined the scraped content with other data and resold it. Reddit alleges that the fourth named company, Perplexity (yes, that Perplexity), was one of the buyers of such illicitly sold data. Perplexity's business model, according to the lawsuit, is to take Reddit's content from Google's search results, feed it into its AI model and 'call it a new product.'"
"As of January, Google began offering post-auction discounts and direct agreements (contractual terms negotiated directly between parties), a practice common across the media industry but previously unheard of in AdX's historically rigid approach. But it's happening - and this AdX thaw comes just after a guilty conviction targeting Google's sell-side tech (which is to say, AdX). TikTok's US operations are being transferred to a new ownership group following regulatory mandates."
Reddit has built a profitable business licensing its content to large language models, and it is suing companies that it says sold its data without a licensing agreement. According to the lawsuit, the data was gathered not directly from Reddit but indirectly through Google, via third-party scraping vendors, so the content is technically public while the route to obtaining it is contested. The named scrapers are SerpApi, Oxylabs and AWMProxy, which allegedly combined the scraped content with other data and resold it. Reddit alleges that Perplexity bought such data and fed Reddit content from Google's search results into its AI model to market the output as a new product. Separately, Google's AdX began offering post-auction discounts and direct agreements in January, and regulatory shifts are also altering ad tech and ecommerce dynamics.
Read at AdExchanger