Google is building a new team to work on AI models that can simulate the physical world.
Tim Brooks, one of the co-leads of OpenAI's video generator Sora, who left in October to join Google's AI lab Google DeepMind, announced in a post on X that he will lead the new team. The team will be part of Google DeepMind.
“DeepMind has ambitious plans to create large-scale generative models that simulate the world,” Brooks wrote Monday morning. “We are recruiting a new team for this mission.”
According to the job listings linked in Brooks' post, the new modeling team will collaborate with and build on the work of Google's Gemini, Veo, and Genie teams to tackle “significant new problems” and scale models to the highest levels of compute. Gemini is Google's flagship AI model series for tasks like image analysis and text generation, while Veo is Google's own video generation model.
As for Genie, it is Google's take on a world model: an AI that can simulate games and 3D environments in real time. Google's latest Genie model, previewed in December, can generate a huge variety of playable 3D worlds.
An interactive game-like world generated by DeepMind's Genie 2 model. Image credit: DeepMind
“We believe scaling [AI training] on video and multimodal data is on the critical path to artificial general intelligence,” one job description reads. Artificial general intelligence (AGI) generally refers to AI that can perform any task that a human can do. “World models will power numerous areas, including visual reasoning and simulation, embodied agent planning, and real-time interactive entertainment.”
According to the description, Brooks' new team will develop “real-time interactive generation” tools on top of the models it builds, and will study how to integrate those models with existing multimodal models such as Gemini.
Many startups and big tech companies are chasing world models, including influential AI researcher Fei-Fei Li's World Labs, Israeli startup Decart, and Odyssey. They believe that world models could one day be used to create interactive media such as video games and movies, or to run realistic simulations such as training environments for robots.
Join Tim and the DeepMind team to work on large-scale world simulation models 🙂
On the critical path to AGI. https://t.co/4Zuju5eMHb
— Logan Kilpatrick (@OfficialLoganK) January 6, 2025
But creators have mixed feelings about the technology.
A recent Wired investigation found that game studios like Activision Blizzard, which has laid off large numbers of employees, are using AI to cut corners, increase productivity, and compensate for layoffs. Additionally, a 2024 study commissioned by the Animation Guild, the union representing Hollywood animators and cartoonists, estimated that by 2026, more than 100,000 U.S.-based film, television, and animation jobs will be disrupted by AI.
Some startups in the emerging world modeling space, like Odyssey, have pledged to work with creative professionals rather than replace them. Whether Google will follow suit remains to be seen.
Copyright issues are also unresolved. Some of the world models appear to have been trained on clips of video game playthroughs, and the companies developing these models could be subject to lawsuits if the videos are not licensed.
Google, which owns YouTube, claims it has permission to train models on YouTube videos under the platform's terms of service. However, the company has not disclosed which specific videos it is using for training.