"Model collapse" threatens to kill progress on generative AIs
Briefly

"If you could get all the data that you needed off the web, that would be fantastic. In reality, the web is so noisy and messy that it's not really representative of the data that you want. The web just doesn't do everything we need."
"On the surface, this does seem like a great idea: just have your current generative AI churn out loads of text, images, or videos - whatever you need - and then use that new additional data to train your new model. No need to worry about running out of content or running up against the demands of content creators."
"Developers need ever more high-quality training data - but now that publishers know their content is being used to train AIs, they've started requesting money for it and, in some cases, suing developers for using it without permission."
Read at Big Think