Sora, OpenAI's new (and first) video generation model, can perform some truly impressive cinematic feats. But the model is even more capable than OpenAI originally suggested, at least judging from a technical paper published this evening.
Co-authored by a number of OpenAI researchers, the paper, entitled “Video Generation Models as World Simulators,” peels back the curtain on key aspects of Sora's architecture. It reveals, for instance, that Sora can generate videos of arbitrary resolution and aspect ratio (up to 1080p). According to the paper, Sora can also perform a variety of image and video editing tasks, from creating looping videos, to extending videos forward and backward in time, to changing the background of existing videos.
But what's most interesting to me is Sora's ability to “simulate the digital world,” as OpenAI's co-authors put it. In one experiment, OpenAI unleashed Sora on Minecraft and had it render the world and its dynamics (including physics) while simultaneously controlling the player.
So how is Sora able to do this? As Nvidia senior researcher Jim Fan (via Quartz) observed, Sora is more of a “data-driven physics engine” than a creative tool. It doesn't just generate a single photo or video; it determines the physics of each object in the environment and renders the photo or video (or, in some cases, an interactive 3D world) based on these calculations.
“These capabilities suggest that the continued scaling of video models is a promising path toward the development of highly capable simulators of the physical and digital worlds and the objects, animals, and people that live within them,” the co-authors wrote.
Now, Sora's usual limitations also apply in the realm of video games. The model is unable to accurately approximate the physics of basic interactions such as glass shattering. And even with interactions it can model, the results are not always consistent: for example, it may render a person eating a hamburger but fail to render the bite marks.
Still, if I'm reading the paper correctly, it seems like Sora could pave the way for more realistic, perhaps even photorealistic, procedurally generated games. That's both exciting and terrifying (think of the impact of deepfakes), which is perhaps why OpenAI chose to gate Sora behind what is currently a very restricted access program.
I hope we learn more sooner rather than later.