"Traditional video methods [are a] brute-force approach to pixel generation, where you're trying to squeeze motion in a couple of frames to create the illusion of movement, but the model actually doesn't really know or reason about what's going on in that scene," he said. Previous video-generation models had physics that were unlike the real world, he added, which general-purpose world model systems help to address.