#ai-training-data

The Laid-off Scientists and Lawyers Training AI to Steal Their Careers

Unemployed workers are being recruited by companies like Mercor to generate training data for AI systems, often for the same kind of technology that displaced them.

Artificial intelligence

fromThe Verge

4 days ago

The laid-off lawyers and PhDs training AI to steal their careers

Unemployed workers are being recruited by companies like Mercor to create training data for AI systems, often the same technology that displaced them from their jobs.

Roam Research

fromwww.socialmediatoday.com

4 days ago

X adds Grok-powered audio option to long-form articles

X introduced audio playback for long-form articles using Grok AI's voice, enabling background listening to boost creator engagement and content consumption while improving AI training data quality.

#privacy-violation

fromZDNET

5 days ago

Privacy technologies

Can Meta see your private life through its Ray-Ban smart glasses? What to know

Meta contractors in Kenya accessed sensitive videos from Ray-Ban smart glasses, including footage of people undressing and in bathrooms, often without wearers' knowledge.

fromArs Technica

6 days ago

Privacy professionals

Workers report watching Ray-Ban Meta-shot footage of people using the bathroom

Meta subcontractor employees in Kenya have accessed sensitive footage from Ray-Ban Meta smart glasses, including intimate moments, raising serious privacy concerns about data annotation practices.


Intellectual property law

fromEngadget

5 days ago

UK government delays AI copyright rules amid artist outcry

The UK government delayed its AI data bill after stakeholder consultation revealed opposition to allowing AI companies to train models on copyrighted materials without creator consent.

Science

fromArtforum

1 week ago

Recursive Resemblance

Generative AI models risk collapse when trained on their own output, causing statistical degradation and improbable sequences that compound approximation errors over time.

Tech industry

fromThe Verge

1 week ago

Your smart TV may be crawling the web for AI

Bright Data offers streaming services an ad-free monetization alternative by converting smart TVs into residential proxies that collect web data for resale to AI companies.

fromSearch Engine Roundtable

2 weeks ago

Anthropic Updates Its Crawler Documentation: ClaudeBot, Claude-User & Claude-SearchBot

ClaudeBot helps enhance the utility and safety of our generative AI models by collecting web content that could potentially contribute to their training. When a site restricts ClaudeBot access, it signals that the site's future materials should be excluded from our AI model training datasets.

Privacy technologies

fromEntrepreneur

2 weeks ago

Most Founders Don't Realize They're Giving Away Their Influence - Here's How to Take It Back

Every search, purchase, loyalty swipe, location ping and scroll feeds systems that now shape pricing, product decisions, hiring and marketing strategies. Most founders understand this in theory, but few grasp the practical consequence: whether they intend to or not, they and their customers are already casting votes with their data. And those votes? They're usually cast passively, on someone else's terms.

Data science

#copyright

fromIPWatchdog.com | Patents & Intellectual Property Law

5 months ago

Intellectual property law

Plaintiffs Propose Plan for Landmark $1.5 Billion Copyright Settlement Process with Anthropic

fromEngadget

1 month ago

Artificial intelligence

Music publishers sue Anthropic for $3 billion over 'flagrant piracy'

fromEngadget

2 months ago

Intellectual property law

New York Times reporter files lawsuit against AI companies

fromLawSites

3 months ago

Intellectual property law

Thomson Reuters Tells Appeals Court: ROSS's Copying Was 'Theft, Not Innovation'

fromwww.theguardian.com

3 months ago

Germany news

ChatGPT violated copyright law by harvesting musicians' lyrics, German court rules

fromBusiness Matters

4 months ago

Intellectual property law

AI firm Stability AI wins High Court case against Getty Images over copyright claims

fromMacon Telegraph

3 weeks ago

Social media marketing

Reddit INSIDER sends major vote of confidence after earnings

fromTheStreet

3 weeks ago

Artificial intelligence

Reddit INSIDER sends major vote of confidence after earnings

fromSocial Media Today

4 months ago

Artificial intelligence

Reddit Launches Legal Action to Block AI Companies from Scraping its Data

fromFast Company

4 months ago

Tech industry

Reddit sues Perplexity and others for allegedly scraping millions of user comments

Daily Tech Insider Maps the AI Arms Race From Silicon Valley to the Moon

Major tech companies are committing massive AI infrastructure spending, accelerating deployment, concentrating control, and driving job and market disruptions.

Artificial intelligence

fromPetaPixel

3 weeks ago

Amazon May Launch Marketplace for Publishers to Sell Content to AI Firms

Amazon is exploring a content marketplace enabling publishers to license articles and data directly to AI companies to replace web scraping and monetize content.

#content-licensing

fromTechCrunch

4 weeks ago

Artificial intelligence

Amazon may launch a marketplace where media sites can sell their content to AI companies | TechCrunch

fromCNET

5 months ago

Artificial intelligence

Online Media Brands Hope a New Protocol Will Stop Unwanted AI Crawlers


#web-scraping

fromArs Technica

1 month ago

Business

Increase of AI bots on the Internet sparks arms race

fromBusiness Insider

1 month ago

Artificial intelligence

Anthropic and OpenAI are crawling the web even more and not giving much back


Artificial intelligence

fromFuturism

1 month ago

Anthropic Knew the Public Would Be Disgusted by How It Was Destroying Physical Books, Secret Documents Reveal

Anthropic bought, shredded, and scanned millions of used books to train AI, relying on first-sale doctrine and a transformative-use ruling to avoid paying authors.

fromThe Verge

1 month ago

Video game company stock prices dip after Google introduces an AI world-generation tool

The stock prices of some major video game companies, including Take-Two Interactive, Roblox, and Unity, had notable declines on Friday, just a day after Google announced its Project Genie tool that lets users prompt AI to generate interactive experiences, Reuters reports. Take-Two's stock price closed at $220.30 (down 7.93 percent from yesterday), Roblox's closed at $65.76 (down 13.17 percent), and Unity's closed at $29.10 (down 24.22 percent).

Video games

#copyright-infringement

fromEntrepreneur

1 month ago

Intellectual property law

Anthropic Is Being Sued for $3 Billion Over Music Piracy

fromTechCrunch

1 month ago

Intellectual property law

YouTubers sue Snap for alleged copyright infringement in training its AI models | TechCrunch

fromTechCrunch

2 months ago

Intellectual property law

John Carreyrou and other authors bring new lawsuit against six major AI companies | TechCrunch

fromThe Verge

4 months ago

Intellectual property law

Studio Ghibli, Bandai Namco, Square Enix demand OpenAI stop using their content to train AI

fromIPWatchdog.com | Patents & Intellectual Property Law

4 months ago

Apple

Authors Take Page from Anthropic in Alleging Apple Infringed Works by Training AI on Pirated Books

fromFuturism

4 months ago

Artificial intelligence

OpenAI in Danger After Authors Suing It Gain Access to Its Internal Slack Messages


#internet-archive

fromEngadget

1 month ago

Media industry

Publishers are blocking the Internet Archive for fear AI scrapers can use it as a workaround

fromNieman Lab

1 month ago

Media industry

News publishers limit Internet Archive access due to AI scraping concerns


fromBuzzFeed

1 month ago

If You Use Gmail, You're Going To Want To Turn Off This 1 Automatic Setting ASAP

For Gmail users, there is an automatic opt-in that may allow Google access to your emailed data (think: your personal and work messages, your attachments) "to train AI models," cybersecurity experts allege. If you don't want this information shared, you need to adjust your settings. In the race for companies to get an ROI on AI, we're already seeing large language models running out of new, human-generated data to train on.

Intellectual property law

fromIPWatchdog.com | Patents & Intellectual Property Law

1 month ago

Other Barks & Bites for Friday, January 23: USAA Petition on Section 101 Distributed for Conference; Fifth Circuit Says Trade Secret Claimants Must Apportion Damages; TRAIN Act Introduced in House

New U.S. IP developments: TRAIN Act proposes subpoena power for AI training data; courts and agencies advance major trademark, patent, antitrust, and trade-secret rulings.

fromGlobal IP & Technology Law Blog

1 month ago

A Year On from UK Government Consultation on Copyright and Artificial Intelligence

Those options range from "option 0", simply doing nothing and leaving UK copyright legislation in its currently uncertain state when it comes to the use of copyright materials to train AI models, through to options which would either require specific consent from rights holders in all cases ("option 1") or allow consent to be assumed by AI developers unless a rights holder objects, subject to developers being transparent about what materials have been used in training ("option 3").

UK politics

Artificial intelligence

fromFuturism

1 month ago

After Being Pillaged By AI Companies, Wikipedia Signs Deal to Get Paid By Them

Wikipedia is licensing its collection of over 65 million articles to major AI companies through a paid Enterprise program to recoup costs and fund operations.

Artificial intelligence

fromAxios

2 months ago

The rise of "web rot"

Older websites persist and degrade search quality and training data, while overall web traffic steadiness masks decline among sites older than five years.

Intellectual property law

fromArs Technica

2 months ago

World's largest shadow library made a 300TB copy of Spotify's most streamed songs

Anna's Archive is offering high-speed, enterprise-level access to scraped LLM training data including unreleased collections, raising concerns about facilitating AI labs and legal exposure.

Music

fromwww.theguardian.com

2 months ago

Activist group says it has scraped 86m music files from Spotify

Anna's Archive claims to have scraped 86 million Spotify tracks and metadata, planning to release them online and potentially accelerate AI training on pirated music.

fromTechCrunch

2 months ago

Adobe hit with proposed class-action, accused of misusing authors' work in AI training | TechCrunch

A proposed class-action lawsuit filed on behalf of Elizabeth Lyon, an author from Oregon, claims that Adobe used pirated versions of numerous books, including her own, to train the company's SlimLM program. Adobe describes SlimLM as a small language model series that can be "optimized for document assistance tasks on mobile devices." It states that SlimLM was pre-trained on SlimPajama-627B, a "deduplicated, multi-corpora, open-source dataset" released by Cerebras in June of 2023.

Artificial intelligence

Miscellaneous

fromeuronews

2 months ago

EU vs. Big Tech: What actions have regulators taken so far?

European regulators are enforcing new AI, digital services, and markets laws to curb Big Tech dominance and protect consumers and creators.

Startup companies

fromThe Verge

2 months ago

Who's making the most money in AI? It's not who you think

Emerging vendors like Mercor and Handshake profit massively by supplying specialized data, engineers, and labeling services to frontier AI labs pursuing AGI.

Intellectual property law

fromTheregister

2 months ago

India's government wants AI companies to pay for content

India proposes blanket training licenses for AI with royalties paid only upon commercialization, set by a government committee and collected via a centralized nonprofit collective.

Intellectual property law

fromTheregister

3 months ago

Really Simple Licensing spec makes AI orgs pay to scrape

Really Simple Licensing (RSL) 1.0 enables machine-readable rules for crawlers, allowing publishers to declare access, processing, and payment terms for web content.

#eu-antitrust

fromFast Company

3 months ago

Miscellaneous

Google faces a new EU antitrust probe over content used for AI Overviews, YouTube

fromComputerworld

3 months ago

EU data protection

European Commission investigates Google's AI training processes

fromThe Verge

3 months ago

Miscellaneous

Google Zero is under investigation by the EU

EU probes Google for allegedly using publisher and YouTube content to boost its AI features without offering compensation or opt-out options, risking anti-competitive harm.

fromwww.theguardian.com

3 months ago

Europe politics

EU opens investigation into Google's use of online content for AI models

The EU is investigating whether Google used web publishers' and YouTube content to train AI unfairly, disadvantaging rival AI developers and content creators.

fromTheregister

3 months ago

Miscellaneous

EU launches Google antitrust probe over AI training

fromEngadget

3 months ago

Miscellaneous

EU opens antitrust investigation into Google's AI practices


Artificial intelligence

fromTheregister

3 months ago

Publishers say no to AI scrapers, block bots at server level

Millions of websites are blocking AI crawler bots via robots.txt to prevent training-data scraping and reduce non-human server traffic.
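
As a rough illustration of the robots.txt blocking described above, the following minimal sketch uses Python's standard `urllib.robotparser` to check whether a given crawler would be permitted to fetch a page. The robots.txt contents, the `is_allowed` helper, and the `example.com` URL are hypothetical and chosen only for illustration; `ClaudeBot` is used because it is one of the crawler names mentioned elsewhere in this feed. Note that robots.txt is advisory, so this only shows what a compliant crawler would conclude.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler site-wide, allow everyone else.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A compliant AI crawler identifying as ClaudeBot is turned away,
# while an ordinary browser user agent falls through to the "*" rule.
claudebot_ok = is_allowed(ROBOTS_TXT, "ClaudeBot", "https://example.com/article")
browser_ok = is_allowed(ROBOTS_TXT, "Mozilla/5.0", "https://example.com/article")
print(claudebot_ok, browser_ok)
```

A publisher could extend the same file with one `User-agent` block per crawler it wants to exclude; crawlers that ignore robots.txt would need to be blocked by other means, such as at the server level.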

Startup companies

fromTechCrunch

3 months ago

Micro1, a Scale AI competitor, touts crossing $100M ARR | TechCrunch

Micro1 grew ARR from roughly $7M to over $100M this year by rapidly recruiting and vetting domain experts to supply human training data for AI labs and enterprises.

fromZDNET

3 months ago

Google denies analyzing your emails for AI training - here's what happened

I contacted Google for comment, and a spokesperson sent me the following statement: "These reports are misleading - we have not changed anyone's settings. Gmail Smart Features have existed for many years, and we do not use your Gmail content for training our Gemini AI model. Lastly, we are always transparent and clear if we make changes to our terms of service and policies."

Privacy professionals

EU data protection

fromwww.dw.com

3 months ago

EU plans to ease GDPR laws and AI constraints in major shift | DW | 11/18/2025

EU proposals would narrow GDPR protections, enable broader data harvesting for AI, remove cookie consent pop-ups, and shift burden onto users to request data removal.

fromFortune

3 months ago

Cloudflare CEO says Google is abusing its monopoly in search to feed its AI | Fortune

"The great patron of the internet for the last 27 years was Google. The great villain of the internet today is also Google," Prince said. He claimed that in the past, for every two pages that Google crawled to inform its search engine, it would, on average, send one visitor to those sites, traffic that publishers can monetise with advertising.

Artificial intelligence

#copyright-law

fromTechCrunch

3 months ago

Germany news

Court rules that OpenAI violated German copyright law; ordered it to pay damages | TechCrunch

fromwww.theguardian.com

4 months ago

Intellectual property law

Labor rules out giving tech giants free rein to mine copyright content to train AI

fromIPWatchdog.com | Patents & Intellectual Property Law

4 months ago

Intellectual property law

Anthropic Settlement Signals AI Innovation Can Thrive Within Existing Copyright Framework

fromElectronic Frontier Foundation

5 months ago

Intellectual property law

Protecting Access to the Law - and Beneficial Uses of AI

fromwww.theguardian.com

5 months ago

UK politics

Adviser to UK minister claimed AI firms would never have to compensate creatives


fromTechzine Global

3 months ago

Wikimedia calls on AI companies to use paid API

Wikimedia has called on AI companies to take responsibility for using Wikipedia content in their language models. This can be achieved by stopping scraping and using the paid API instead. In a blog post, the organization states that artificial intelligence cannot exist without the human knowledge collected and maintained on platforms such as Wikipedia. To maintain that balance, Wikimedia asks developers of generative AI to clearly cite their sources and contribute to the continued existence of the open knowledge project via the paid Wikimedia Enterprise platform.

Artificial intelligence

fromThe Verge

4 months ago

Elon Musk's Grokipedia launches with AI-cloned pages from Wikipedia

Since 2001, Wikipedia has been the backbone of knowledge on the internet. Hosted by the Wikimedia Foundation, it remains the only top website in the world run by a nonprofit. Unlike newer projects, Wikipedia's strengths are clear: it has transparent policies, rigorous volunteer oversight, and a strong culture of continuous improvement. Wikipedia is an encyclopedia, written to inform billions of readers without promoting a particular point of view.

Non-profit organizations

Artificial intelligence

fromABC7 Los Angeles

4 months ago

Elon Musk launches Grokipedia to compete with online encyclopedia Wikipedia

Elon Musk launched Grokipedia, a crowdsourced encyclopedia powered by xAI, presenting itself as a minimalist Wikipedia rival claiming to provide the complete truth.

Artificial intelligence

fromTechCrunch

4 months ago

How AI labs use Mercor to get the data companies won't share | TechCrunch

AI labs hire former senior employees through marketplaces like Mercor to obtain industry workflows and train automation models without corporate data contracts.

fromComputerworld

4 months ago

Canva debuts foundational 'design' model, extends AI tools across its app

Canva has built its own foundational AI model that generates layered designs users can edit more easily. It's one of several generative AI-related features Canva announced Thursday, alongside expanded access to its AI assistant and content generation capabilities across its app. To date, Canva has partnered with a variety of AI model providers for content generation - Black Forest Labs, Google, and OpenAI among them - and it acquired Leonardo AI last year.

Artificial intelligence

#data-scraping

fromIPWatchdog.com | Patents & Intellectual Property Law

4 months ago

Artificial intelligence

Reddit Dubs Perplexity AI and Data Scraping Companies 'Would-Be Bank Robbers'

fromThe Mercury News

4 months ago

Artificial intelligence

Reddit sues AI company Perplexity and others for 'industrial-scale' scraping of user comments

fromAdExchanger

4 months ago

Tech industry

Sour Scrapes; (Anti)-trust The Process | AdExchanger

fromBusiness Insider

4 months ago

Artificial intelligence

Reddit drags Perplexity in a new lawsuit, accusing it of building up a $20 billion company off stolen data

Facebook's new button lets its AI look at photos you haven't uploaded yet

Meta's opt-in camera-roll feature uploads unpublished photos to its cloud, suggests edits, and can use edited or shared images to train its AI.

Silicon Valley

fromBusiness Insider

4 months ago

Scale AI agreed to settle multiple lawsuits from its California contractors

Scale AI agreed to settle four California lawsuits alleging worker misclassification, underpayment, and denied benefits and has stopped hiring California gig workers.

fromZDNET

4 months ago

Your Uber driver has a new side hustle: Training AI for cash

According to Uber, beginning later this year, drivers and couriers who opt into the program can complete "digital tasks" within Uber's Driver app. These tasks can include submitting a video of themselves speaking in their native language, uploading pictures of specific everyday items, or presenting documents written in a different language. After tasks are completed, the earnings will be in the users' balance within 24 hours. Compensation depends on the time commitment to complete tasks and their complexity.

Artificial intelligence

Privacy technologies

fromExchangewire

4 months ago

Verve Study Shows That 75% of Consumers are More Open to Watching Ads for Free Content

Consumers increasingly accept ad-supported content while expressing rising concern about data use, especially for AI training.

fromArs Technica

4 months ago

Inside the web infrastructure revolt over Google's AI Overviews

The new change, which Cloudflare calls its Content Signals Policy, happened after publishers and other companies that depend on web traffic have cried foul over Google's AI Overviews and similar AI answer engines, saying they are sharply cutting those companies' path to revenue because they don't send traffic back to the source of the information. There have been lawsuits, efforts to kick-start new marketplaces to ensure compensation, and more-

Tech industry

Science

fromNature

5 months ago

How stereotypes shape AI - and what that means for the future of hiring

Internet images encode gendered stereotypes: women shown younger and linked to caregiving jobs, men linked to leadership roles, embedding bias in AI training and hiring.

Privacy technologies

fromTechCrunch

5 months ago

Anker offered Eufy camera owners $2 per video for AI training | TechCrunch

Anker paid Eufy users $2 per theft video and encouraged staged recordings to build AI training data, creating privacy and security risks.

fromBusiness Insider

5 months ago

AI has already run out of training data - but there's more waiting to be unlocked, Goldman's data chief says

"We've already run out of data," Neema Raphael, Goldman Sachs' chief data officer and head of data engineering, said on the bank's "Exchanges" podcast published on Tuesday.

Artificial intelligence

Privacy professionals

fromTechCrunch

5 months ago

Anker offered to pay Eufy camera owners to share videos for training its AI | TechCrunch

Companies pay users for camera and call recordings to train AI models, creating value for users but introducing significant privacy and security risks.

#call-recording

fromBusiness Insider

5 months ago

Privacy professionals

Neon, a buzzy app that pays to record your calls for AI training data, goes offline to address a security scandal

fromTechCrunch

5 months ago

Privacy technologies

Neon, the No. 2 social app on the Apple App Store, pays users to record their phone calls and sells data to AI firms | TechCrunch


fromstupidDOPE | Est. 2008

5 months ago

The Future of Content Licensing: How RSL Bridges Publishers and AI | stupidDOPE | Est. 2008

For decades, publishers large and small have created the news, culture, entertainment, and educational resources that shape how society consumes information. Yet in recent years, the rise of artificial intelligence has added a new twist to the ongoing struggle for sustainable publishing. AI companies are building tools capable of generating responses, summaries, and insights trained on vast amounts of web content. The problem? Many publishers see little to no compensation for their role in shaping the data that fuels these systems.

Artificial intelligence

Information security

fromTechCrunch

5 months ago

Exclusive: Neon takes down app after exposing users' phone numbers, call recordings, and transcripts

Neon, an app paying users for call recordings to sell to AI firms, exposed users' phone numbers, recordings, and transcripts through a security flaw.

fromTech.co

5 months ago

How to Stop LinkedIn From Using Your Data to Train Its AI Models

When Is LinkedIn Going to Start Using My Data to Train Its AI Models? LinkedIn has announced that it will start using some of its users' data to train its AI models starting on November 3rd, 2025. Users from the EU, EEA, Switzerland, Canada, and Hong Kong will all be affected. At this stage, US users will not be affected, but this could soon change.

EU data protection

fromFast Company

5 months ago

Publishers are finally going after Google. What happens now?

Media companies have filed so many lawsuits against AI companies over the past two years that the act has become routine. When I report on these in The Media Copilot newsletter, they're often digest items, adding to the pile of publishers who want fair compensation for the content AI labs have ingested to create large language models (LLMs). There are so many that elaborate infographics are required to keep track of them all.

Media industry

Intellectual property law

fromThe Verge

5 months ago

Record labels claim AI generator Suno illegally ripped their songs from YouTube

Major record labels accuse Suno of pirating songs from YouTube to train AI music models, alleging circumvention of YouTube protections and violations of the DMCA.

Business

from24/7 Wall St.

5 months ago

3 Growth Stocks to Buy If You Only Have $10,000

Deploy $10,000 into selective growth stocks, prioritizing firms increasing revenue and margins—like Reddit—expecting AI-driven demand to amplify long-term returns.

Artificial intelligence

fromBusiness Insider

5 months ago

Google's idea for fixing the AI data drought? Cleaning up risky data.

Generative Data Refinement (GDR) rewrites unsafe, toxic, or PII-containing text using pretrained generative models to purify it for AI training.

Artificial intelligence

fromTechCrunch

5 months ago

Micro1, a competitor to Scale AI, raises funds at $500M valuation | TechCrunch

Micro1 raised $35 million Series A at a $500M valuation while rapidly growing ARR to $50M, positioning to supply human-labeled data for AI labs.

Artificial intelligence

fromBusiness Matters

5 months ago

TikTok tops list of most scraped websites as AI training reshapes data priorities

TikTok became the world’s most scraped website in 2025 after a 321% surge, reflecting AI-driven demand for multimodal training data.

[ Load more ]

#ai-training-data#ai-training-data

The Laid-off Scientists and Lawyers Training AI to Steal Their Careers

The laid-off lawyers and PhDs training AI to steal their careers

X adds Grok-powered audio option to long-form articles

Can Meta see your private life through its Ray-Ban smart glasses? What to know

Workers report watching Ray-Ban Meta-shot footage of people using the bathroom

UK government delays AI copyright rules amid artist outcry

Recursive Resemblance

Your smart TV may be crawling the web for AI

Anthropic Updates Its Crawler Documentation: ClaudeBot, Claude-User & Claude-SearchBot

Most Founders Don't Realize They're Giving Away Their Influence - Here's How to Take It Back

Plaintiffs Propose Plan for Landmark $1.5 Billion Copyright Settlement Process with Anthropic

Music publishers sue Anthropic for $3 billion over 'flagrant piracy'

New York Times reporter files lawsuit against AI companies

Thomson Reuters Tells Appeals Court: ROSS's Copying Was 'Theft, Not Innovation'

ChatGPT violated copyright law by harvesting musicians' lyrics, German court rules

AI firm Stability AI wins High Court case against Getty Images over copyright claims

Reddit INSIDER sends major vote of confidence after earnings

Reddit Launches Legal Action to Block AI Companies from Scraping its Data

Reddit sues Perplexity and others for allegedly scraping millions of user comments

Daily Tech Insider Maps the AI Arms Race From Silicon Valley to the Moon

Amazon May Launch Marketplace for Publishers to Sell Content to AI Firms

Amazon may launch a marketplace where media sites can sell their content to AI companies | TechCrunch

Online Media Brands Hope a New Protocol Will Stop Unwanted AI Crawlers

Increase of AI bots on the Internet sparks arms race

Anthropic and OpenAI are crawling the web even more and not giving much back

Anthropic Knew the Public Would Be Disgusted by How It Was Destroying Physical Books, Secret Documents Reveal

Video game company stock prices dip after Google introduces an AI world-generation tool

Anthropic Is Being Sued for $3 Billion Over Music Piracy

YouTubers sue Snap for alleged copyright infringement in training its AI models | TechCrunch

John Carreyrou and other authors bring new lawsuit against six major AI companies | TechCrunch

Studio Ghibli, Bandai Namco, Square Enix demand OpenAI stop using their content to train AI

Authors Take Page from Anthropic in Alleging Apple Infringed Works by Training AI on Pirated Books

OpenAI in Danger After Authors Suing It Gain Access to Its Internal Slack Messages

Publishers are blocking the Internet Archive for fear AI scrapers can use it as a workaround

News publishers limit Internet Archive access due to AI scraping concerns

If You Use Gmail, You're Going To Want To Turn Off This 1 Automatic Setting ASAP

Other Barks & Bites for Friday, January 23: USAA Petition on Section 101 Distributed for Conference; Fifth Circuit Says Trade Secret Claimants Must Apportion Damages; TRAIN Act Introduced in House

A Year On from UK Government Consultation on Copyright and Artificial Intelligence

After Being Pillaged By AI Companies, Wikipedia Signs Deal to Get Paid By Them

The rise of "web rot"

World's largest shadow library made a 300TB copy of Spotify's most streamed songs

Activist group says it has scraped 86m music files from Spotify

Adobe hit with proposed class-action, accused of misusing authors' work in AI training | TechCrunch

EU vs. Big Tech: What actions have regulators taken so far?

Who's making the most money in AI? It's not who you think

India's government wants AI companies to pay for content

Really Simple Licensing spec makes AI orgs pay to scrape

Google faces a new EU antitrust probe over content used for AI Overviews, YouTube

European Commission investigates Google's AI training processes

Google Zero is under investigation by the EU

EU opens investigation into Google's use of online content for AI models
