Bluesky users debate plans around user data and AI training | TechCrunch
Briefly

Bluesky's recent proposal on GitHub aims to give users control over whether their posts and data can be scraped for generative AI training and public archiving. CEO Jay Graber emphasized that the platform is trying to establish a new standard that parallels the function of robots.txt files for websites. Users would be able to set preferences for how their data is used, but reaction was mixed, with some users alarmed that any such data sharing would contradict Bluesky's earlier stance on user privacy.
Oh, hell no! The beauty of this platform was the NOT sharing of information. Especially gen AI. Don't you cave now.
Graber stated that generative AI companies are already scraping public data from across the web, including from Bluesky, since everything on Bluesky is public.
Bluesky is trying to create a new standard to govern that scraping, similar to the robots.txt file that websites use to communicate permissions.
The proposed standard would provide a machine-readable format that good actors are expected to respect; it would carry ethical weight but would not be legally enforceable.
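
For illustration only, here is a minimal TypeScript sketch of what such a machine-readable preference record could look like. The field names are hypothetical and are not the schema from Bluesky's actual GitHub proposal; they simply show how a scraping preference might be expressed in a form that crawlers can check, much as they check robots.txt.

    // Hypothetical sketch: illustrative field names only, not the schema
    // from Bluesky's GitHub proposal.
    interface DataReusePreferences {
      // May posts be used to train generative AI models?
      allowGenerativeAiTraining: boolean;
      // May posts be included in public archives or bulk datasets?
      allowPublicArchiving: boolean;
    }

    // A user opting out of both uses would publish something like:
    const optOutPreferences: DataReusePreferences = {
      allowGenerativeAiTraining: false,
      allowPublicArchiving: false,
    };

    // Good-faith scrapers would read these flags before collecting a
    // user's posts; nothing technically prevents a bad actor from
    // ignoring them.

As with robots.txt, compliance would depend on scrapers choosing to honor the stated preferences rather than on any technical or legal enforcement.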
Read at TechCrunch