How synthetic data trains AI to solve real problems
Briefly

"As artificial intelligence researchers exhaust the supply of real data on the web and in digitized archives, they are increasingly turning to synthetic data, artificially generated examples that mimic real ones. But that creates a paradox. In science, making up data is a cardinal sin. Fake data and misinformation are already undermining trust in information online. So how can synthetic data possibly be good? Is it just a polite euphemism for deception?"
"As a machine learning researcher, I think the answer lies in intent and transparency. Synthetic data is generally not created to manipulate results or mislead people. In fact, ethics may require AI companies to use synthetic data: Releasing real human face images, for example, can violate privacy, whereas synthetic faces can offer similar benefit with formal privacy guarantees. There are other reasons that help explain the growing use of synthetic data in training AI models."
AI-powered features such as night-mode photography can be trained on synthetic nighttime images, computer-generated scenes that were never actually photographed. Researchers are turning to synthetic data as the supply of real-world data on the web and in archives runs out. Synthetic examples can mimic real data while addressing privacy, cost, safety, and scarcity challenges. Synthetic faces, for example, can stand in for sensitive images of real people while offering formal privacy guarantees. Simulation can also represent rare or risky scenarios, such as storms or unpaved roads for self-driving cars, that are expensive or dangerous to collect in reality. Intent and transparency guide the ethical use of synthetic data.
Read at Fast Company