Humans flunk the Turing test for voices as bots get chattier
Briefly

"Think you can distinguish between a human voice and a robot? Think again, because the numbers are starting to say otherwise. Researchers at Queen Mary University of London and University College London found that people can no longer reliably distinguish between genuine speech and cloned AI voices. Their study, published in open-access journal PLOS One, found that when people were played recordings of real people together with AI-generated versions of the same voices, their judgments were little better than random chance."
"The team, led by psychologist Nadine Lavan, tested 80 audio samples: half human, half synthetic. The fully synthetic AI voices - that is, those generated entirely by text-to-speech models rather than trained on recordings of a real person - were easier to spot, with 'only' 41 percent mistaken for human. But when the voices were cloned from actual people using a few minutes of recorded speech, 58 percent of them fooled listeners into thinking they were human."
"'We show that, under certain conditions, it is not possible for human listeners to accurately discriminate between AI-generated voices and genuine recordings,' the authors wrote, adding that the results 'demonstrate how voices generated from limited amounts of input data can reach a similar level of human likeness to real recordings of human speakers.'"
Listeners heard 80 audio samples, half human and half synthetic, and their judgments were little better than random chance. Fully synthetic text-to-speech voices were easier to spot, with only 41 percent mistaken for human, while cloned voices produced from under five minutes of recorded speech fooled listeners 58 percent of the time. Listeners correctly identified real voices only 62 percent of the time, so they showed no meaningful sensitivity to the difference between real and cloned voices. The clones were created with off-the-shelf software from ElevenLabs. Some cloned voices were perceived as more trustworthy and more dominant than the corresponding human recordings, even though the clones were not hyperrealistic, that is, not judged to sound more human than the real voices.
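To see why 62 percent against 58 percent amounts to almost no sensitivity, the reported percentages can be plugged into d-prime, the standard signal detection measure. The sketch below is only a back-of-the-envelope illustration: it treats the aggregate percentages as a hit rate and a false-alarm rate, which is an assumption and not the authors' own per-listener analysis, and the function and constant names are hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Signal detection sensitivity: z(hit rate) minus z(false-alarm rate)."""
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)


# Aggregate figures quoted in the summary, used here as rough rates.
# Assumption: "judged human" counts as a hit for real voices and as a
# false alarm for AI voices.
REAL_JUDGED_HUMAN = 0.62       # real voices correctly identified
CLONE_JUDGED_HUMAN = 0.58      # cloned voices mistaken for human
SYNTHETIC_JUDGED_HUMAN = 0.41  # fully synthetic voices mistaken for human

d_clone = d_prime(REAL_JUDGED_HUMAN, CLONE_JUDGED_HUMAN)
d_synthetic = d_prime(REAL_JUDGED_HUMAN, SYNTHETIC_JUDGED_HUMAN)

print(f"real vs cloned:          d' = {d_clone:.2f}")
print(f"real vs fully synthetic: d' = {d_synthetic:.2f}")
```

A d-prime of 0 means listeners cannot separate the two categories at all. On these rough figures the real-versus-cloned value comes out at about 0.1, while real versus fully synthetic is about 0.5, which matches the summary's point that fully synthetic voices were easier to spot.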
Read at The Register