OpenAI has finally released the real-time video capabilities for ChatGPT that it demoed roughly seven months ago.
During a livestream on Thursday, the company announced that Advanced Voice Mode, ChatGPT's human-like conversation feature, is getting vision. Using the ChatGPT app, users with a ChatGPT Plus, Team, or Pro subscription can point their phone at an object and have ChatGPT respond in near real time.
Advanced Voice Mode with Vision can also understand what's on your device's screen via screen sharing. For example, it can explain the various settings menus or offer suggestions on a math problem.
To access Advanced Voice Mode with Vision, tap the voice icon next to the ChatGPT chat bar, then tap the video icon in the bottom left to start video. To share your screen, tap the three-dot menu and select "Share Screen."
OpenAI says the rollout of Advanced Voice Mode with Vision begins today and will be completed next week. However, not all users will get access right away. ChatGPT Enterprise and Edu subscribers will have to wait until January, and there's no timeline yet for ChatGPT users in the EU, Switzerland, Iceland, Norway, or Liechtenstein.
In a recent demo on CBS's 60 Minutes, OpenAI president Greg Brockman had Advanced Voice Mode with Vision quiz Anderson Cooper on his anatomy skills. As Cooper drew body parts on a blackboard, ChatGPT could "understand" what he was drawing.
Image credit: OpenAI
“The location is accurate,” said ChatGPT. “The brain is right there in the head. In terms of shape, it's a good start. The brain is almost oval-shaped.”
In the same demo, however, Advanced Voice Mode with Vision made a mistake on a geometry problem, suggesting the feature is prone to hallucinations.
Advanced Voice Mode with Vision has been delayed multiple times, reportedly in part because OpenAI announced the feature long before it was ready to ship. In April, OpenAI promised to roll out Advanced Voice Mode to users "in the coming weeks." A few months later, the company said it needed more time.
When Advanced Voice Mode finally arrived for some ChatGPT users in early fall, it lacked the visual analysis component. In the lead-up to today's launch, OpenAI had been focused on bringing the voice-only Advanced Voice Mode experience to additional platforms and to users in the EU.
In addition to Advanced Voice Mode with Vision, OpenAI today launched a festive "Santa Mode," which adds Santa's voice as a preset voice in ChatGPT. Users can access it by tapping or clicking the snowflake icon next to the prompt bar in the ChatGPT app.