
"In a blog post, ByteDance - the China-based company behind TikTok - says Seedance 2.0 supports prompts that combine text, images, video, and audio. The company claims it "delivers a substantial leap in generation quality," offering improvements in generating complex scenes with multiple subjects and its ability to follow instructions. Users can refine their text prompts by feeding Seedance 2.0 up to nine images, three video clips, and three audio clips."
"AI-powered generation models have only gotten more advanced within the past year, with Google Veo 3 adding the ability to generate audio-supported clips, and OpenAI launching Sora 2 along with a new app that allows users to create videos with "hyperreal motion and sound." The AI startup, Runway, has also released a new version of its AI model that it claims has "unprecedented" accuracy."
Seedance 2.0 supports multimodal prompts that combine text, images, video, and audio, and accepts up to nine images, three video clips, and three audio clips. The model can generate up to 15-second clips with audio while accounting for camera movement, visual effects, and motion, and can reference text-based storyboards. ByteDance reports improved generation quality for complex scenes with multiple subjects and better instruction following. Competitors have added similar features over the past year: Google Veo 3 added audio-supported clips, OpenAI launched Sora 2 and a new app for hyperreal motion and sound, and Runway released a version claiming unprecedented accuracy.
Read at The Verge