Google is upgrading its visual search app Lens to answer questions about your surroundings in near real-time.
English-speaking Android and iOS users with the Google app installed can now start capturing video via Lens and ask questions about objects of interest in the video.
Lou Wang, Lens' director of product management, said the feature uses a “customized” Gemini model to understand the video and associated questions. Gemini is Google's family of AI models that powers numerous products across the company's portfolio.
“Say you want to learn more about some interesting fish,” Wang said at a press briefing. “[Lens will] produce an overview explaining why they're swimming in circles, along with more resources and helpful information.”
To access Lens' new video analysis feature, you must sign up for Google's Search Labs program and opt in to the “AI Overviews and more” experiment in Labs. In the Google app, holding down your phone's shutter button activates Lens' video capture mode.
Ask a question while recording a video and Lens links out to an answer supplied by AI Overviews, the feature in Google Search that uses AI to summarize information from around the web.
Image credit: Google
According to Wang, Lens uses AI to determine which frames in a video are the most “interesting,” salient, and, above all, relevant to the question being asked, and uses those frames to ground the answer from AI Overviews.
“This all comes from looking at how people are trying to use things like Lens today,” Wang said. “If you lower the barrier to asking these questions and help satisfy people's curiosity, people will pick it up quite naturally.”
The Lens video launch comes on the heels of a similar feature Meta previewed last month for its Ray-Ban Meta smart glasses. Meta plans to bring real-time AI video capabilities to the glasses, letting wearers ask questions about what's around them (e.g., “What type of flower is this?”).
OpenAI has also previewed a feature that lets Advanced Voice Mode, a premium ChatGPT capability, understand videos. Eventually, Advanced Voice Mode should be able to analyze video in real time and respond in context.
Setting aside the fact that Lens is asynchronous (you can't chat with it in real time), and assuming the video feature works as advertised, Google has beaten both companies to the punch. The caveat matters: no live demo was shown at the press briefing, and Google has a history of over-promising when it comes to AI capabilities.
Beyond video analysis, Lens can now search with images and text at once. English-speaking users, including those not enrolled in Labs, can open the Google app, press and hold the shutter button to take a photo, and then ask a question aloud.
Finally, Lens is adding new e-commerce-specific features.
Starting today, when Lens on Android or iOS recognizes a product, it displays information about it, including price, deals, brand, reviews, and stock availability. Product identification works on uploaded or newly snapped photos (not videos) and is currently limited to select countries and certain shopping categories, such as electronics, toys, and beauty.
Image credit: Google
“Let's say you spot a backpack and you like it,” Wang said. “You can use Lens to identify the product and instantly see the details you care about.”
There's an advertising angle here, too. Google says the results pages for products identified by Lens will also show “relevant” shopping ads with options and prices.
Why put ads in Lens? According to Google, roughly 4 billion Lens searches each month are related to shopping. For a tech giant whose lifeblood is advertising, that's an opportunity too lucrative to pass up.