Google on Wednesday announced Project Mariner, a research prototype from its DeepMind division and the company's first AI agent capable of taking actions on the web. The Gemini-powered agent takes control of your Chrome browser and uses websites much like a human would: moving the cursor around the screen, clicking buttons, and filling out forms.
Google says it is starting by releasing the AI agent to a small group of pre-selected testers on Wednesday.
Google continues to experiment with new ways for Gemini to read, summarize, and now use websites. A Google executive told TechCrunch that this represents a "fundamentally new UX paradigm shift." The idea is that users will no longer interact with websites directly, but instead with generative AI systems that do so on their behalf.
Project Mariner Overview. Image credit: Google
These changes could impact millions of businesses, from publishers like TechCrunch to retailers like Walmart, that have traditionally relied on Google to send real people to visit and engage with their websites.
In a TechCrunch demo, Google Labs Director Jaclyn Konzelmann showed how Project Mariner works.
Once the AI agent is set up via its Chrome extension, a chat window appears on the right side of your browser. You can tell the agent to do something like, "Create a shopping cart from a grocery store based on this list."
Here's what Project Mariner looks like when used. Image credit: Google
From there, the AI agent navigated to a grocery store's website (in this case, Safeway), searched for each item, and added it to a virtual shopping cart. One thing you'll quickly notice is how slow the agent is: there was a delay of roughly five seconds between each cursor movement. Occasionally, the agent would stop the task, return to the chat window, and ask for clarification on a particular item (such as how many carrots to buy).
Google's agent isn't allowed to enter credit card numbers or billing information, so it can't complete a checkout. Project Mariner also won't accept cookies or agree to terms of service on your behalf. Google says it intentionally withholds these abilities from the agent to give users more control.
Behind the scenes, Google's agent takes screenshots of your browser window (something users must consent to in the terms of service) and sends them to Gemini in the cloud for processing. Gemini then sends instructions back to your computer to navigate the web page.
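The loop described above can be sketched in a few lines. This is a minimal illustration only: Project Mariner's internals are not public, and every name here (`AgentAction`, `plan_next_action`, the stubbed model call) is a hypothetical stand-in, not a real Gemini or Chrome API.

```python
# Hypothetical sketch of the screenshot -> cloud model -> browser-action loop.
# Nothing here reflects Project Mariner's actual implementation.
from dataclasses import dataclass


@dataclass
class AgentAction:
    kind: str          # e.g. "click", "type", "ask_user", "done"
    target: str = ""   # element or coordinate the action applies to
    text: str = ""     # text to type, or a clarifying question for the user


def plan_next_action(screenshot_png: bytes, goal: str) -> AgentAction:
    """Stand-in for the cloud call: a real system would upload the
    screenshot and goal to the model and parse a structured response.
    Stubbed out here with a trivial rule for illustration."""
    if b"carrots" in screenshot_png:
        # Mirrors the demo: the agent pauses to ask about an ambiguous item.
        return AgentAction("ask_user", text="How many carrots should I add?")
    return AgentAction("done")


def run_agent(goal: str, take_screenshot, perform) -> list[AgentAction]:
    """Repeat: screenshot the active tab, ask the model for the next
    action, execute it locally. Control returns to the user when the
    model finishes or needs clarification."""
    history = []
    while True:
        action = plan_next_action(take_screenshot(), goal)
        history.append(action)
        if action.kind in ("done", "ask_user"):
            return history
        perform(action)
```

The key design point the article describes is visible in the loop: all clicking and typing happens locally in the user's browser, while only screenshots and instructions cross the network.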
Project Mariner can also be used for tasks that currently require users to click around the web, such as finding flights and hotels, buying household goods, and finding recipes.
One big caveat is that Project Mariner only works in the active tab of your Chrome browser, which means you can't use your computer for anything else while the agent is working; you have to watch Gemini slowly click around. Koray Kavukcuoglu, chief technology officer at Google DeepMind, says this was an intentional decision to keep users aware of what Google's AI agents are doing.
“Because [Gemini] is now taking actions on behalf of the user, it's important to do this in stages,” Kavukcuoglu said in an interview with TechCrunch. “It's complementary. You can use the website as an individual, and your agent can also do everything you do on the website.”
Website owners may be relieved to hear that Google's AI agent loads pages in a real browser on the user's screen, meaning publishers and retailers still register visits to their pages. However, Google's AI agents could make users less engaged with the websites they visit, and one day users may not need to visit those websites at all.
“[Project Mariner] is a fundamentally new UX paradigm shift that we're witnessing right now,” Konzelmann told TechCrunch. “We need to figure out the right way for all of this to change how users interact with the web, and how publishers can create experiences for both users and agents in the future.”
Besides Project Mariner, Google on Wednesday also announced several other AI agents for more specific tasks.
One of them, Deep Research, is an AI agent that aims to help users explore complex topics by creating multi-step research plans. It appears to compete with OpenAI's o1, which is also capable of multi-step reasoning. However, a Google spokesperson said the agent is not designed to solve math or logical-reasoning problems, write code, or perform data analysis. Deep Research is rolling out in Gemini Advanced today and will come to the Gemini app in 2025.
When you give Deep Research a tough or broad question, it creates a multi-step plan for answering it. Once you approve the plan, Deep Research takes a few minutes to search the web and generate a lengthy report on its findings.
Another new AI agent from Google, Jules, aims to help developers with coding tasks. Integrated directly into GitHub workflows, Jules can look at your existing work and make changes directly in GitHub. Jules is rolling out to a select group of beta testers today and is expected to become broadly available in late 2025.
Finally, Google DeepMind says it is building on its long history of game-playing AI to develop agents that help you navigate video games. Google is working with game developers such as Supercell to test Gemini's ability to interpret game worlds like that of Clash of Clans. Google hasn't announced a release date for the prototype, but says the research is helping it build AI agents that can navigate the physical world as well as virtual ones.
It's unclear when Project Mariner will roll out to Google's broader user base, but when it does, agents like these could have a significant impact on the web at large. The web was designed for humans to use, but Google's AI agents could change that standard.