An investigation from Wired and Proof News found that this dataset, called YouTube Subtitles, contains transcripts from over 173,000 YouTube videos on more than 48,000 different channels.
The AI scraping issue extends beyond YouTube: initiatives such as the app Cara and the University of Chicago's Nightshade aim to protect artists by limiting AI systems' ability to analyze their content.
Creators struggle to safeguard their content against AI scraping, which raises concerns about privacy and intellectual property protection across the tech industry.