Oxford University scholars warn that training generative AI models on synthetic data can drastically degrade their accuracy, potentially rendering them useless.
Model collapse, a degenerative process described by Ilia Shumailov and colleagues, occurs when model-generated data pollutes the training sets of subsequent models, causing them to misperceive reality.
Over successive generations, models trained on synthetic data lose track of less-common facts first, then produce increasingly generic and irrelevant output that eventually degenerates into gibberish.
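The mechanism can be illustrated with a toy simulation (this is not the authors' experimental setup, just a minimal sketch under simplifying assumptions): treat a "model" as the empirical frequency table fitted to a finite sample drawn from the previous model. Any token that happens not to be sampled in a generation gets probability zero and can never return, so rare items vanish first and the distribution's support only shrinks. The vocabulary and probabilities below are hypothetical.

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical toy "vocabulary" of 10 tokens with a long tail of rare ones.
true_probs = dict(zip(range(10),
    [0.30, 0.25, 0.15, 0.10, 0.08, 0.05, 0.03, 0.02, 0.015, 0.005]))

def fit_and_resample(probs, n_samples=100):
    """'Train' the next-generation model on n_samples drawn from the
    current one: the new model is just the empirical frequencies.
    Tokens that are never sampled drop to probability 0 and can
    never reappear -- the tail-loss mechanism of model collapse."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    draws = random.choices(tokens, weights=weights, k=n_samples)
    counts = Counter(draws)
    return {t: c / n_samples for t, c in counts.items()}

model = dict(true_probs)
support_sizes = [len(model)]          # how many tokens still exist
for generation in range(30):
    model = fit_and_resample(model)
    support_sizes.append(len(model))

print("support size per generation:", support_sizes)
```

Because each generation's support is a subset of the previous one, the count of surviving tokens is monotonically non-increasing; with a small sample size the rarest tokens are typically lost within a few generations, which mirrors the paper's observation that tail knowledge disappears first.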