
"The folks over at YouTube are putting it to good use with a focus on accessibility and realism. So, what's next? Making the lips move naturally to the tune of any language, even if the speaker in the video doesn't speak it. Building on the auto-dubbing feature that was launched last year, the team has now come up with the new AI-powered lip sync feature."
"Machine-translated audio has improved dramatically over the past few quarters, and it now almost sounds natural. Audio overviews in Google's NotebookLM are a great example. But when it comes to videos, they fall flat because the lip movement simply doesn't match what the speaker is saying with a translated version of the script. It's pretty jarring and off-putting. The AI-powered lip sync feature wants to overcome that audio-visual dissonance. And from the samples that I've seen so far, they feel uncannily natural."
The video content industry faces an inflection point where AI amplifies creators' abilities while raising misinformation concerns. YouTube emphasizes accessibility and realism by extending its auto-dubbing feature with AI-powered lip sync that alters on-screen pixels to match translated audio. Machine-translated audio has advanced to the point of sounding almost natural, but in video the speaker's lip movements no longer match the translated speech, which creates a jarring audio-visual dissonance. The lip sync system requires a custom tech stack and a 3D understanding of lip shapes, teeth, posture, and face to preserve conversational tone and nuanced speech while synchronizing realistic lip movements across languages. Early samples appear uncannily natural, and the feature builds on auto-dubbing's wide adoption.
Read at Digital Trends