AI has already run out of training data - but there's more waiting to be unlocked, Goldman's data chief says
""We've already run out of data," Neema Raphael, Goldman Sachs' chief data officer and head of data engineering, said on the bank's "Exchanges" podcast published on Tuesday."
"'I think the real interesting thing is going to be how previous models then shape what the next iteration of the world is going to look like in this way,' Raphael said."
"'I think from a consumer world model, I think it's interesting we've definitely in the synthetic sort of explosion of data. But from an enterprise perspective, I think there's still a lot of juice I'd say to be squeezed in that,' he said."
AI development is encountering a shortage of fresh training data, prompting shifts in how new systems are built. Developers increasingly rely on synthetic data (machine-generated text, images, and code) because its supply is virtually unlimited, but synthetic sources risk feeding low-quality outputs into models. Some models may end up trained on the outputs of previous models rather than on wholly new data. Enterprise environments, by contrast, hold large, underutilized proprietary datasets, including trading flows and client interactions, that can provide higher-quality signals. Harnessing that proprietary corporate data could offset the depletion of public-web data and improve the value and performance of AI tools for firms.
Read at Business Insider