When OpenAI first demoed its eerily realistic, near real-time “advanced voice mode” for its AI-powered chatbot platform ChatGPT in May, the company said it would roll out the feature to ChatGPT's paid users within a few weeks.
A few months later, OpenAI said it needed more time.
In a post on its official Discord server, OpenAI said it had planned to roll out an alpha version of the advanced voice mode to a small number of ChatGPT Plus users in late June, but persistent issues forced it to postpone the launch until sometime in July.
“For example, we are improving our model's ability to detect and reject certain content,” OpenAI wrote. “We're also working to improve the user experience and preparing our infrastructure to scale to millions of people while maintaining real-time responsiveness. As part of our iterative deployment strategy, we'll start with an alpha with a small number of users, gather feedback, and expand based on our learnings.”
OpenAI said the advanced voice mode may not reach all ChatGPT Plus customers until the fall, depending on whether it passes certain internal safety and reliability checks. The delay won't affect the rollout of the new video and screen-sharing capabilities that were demoed separately at OpenAI's spring press event.
These capabilities include solving math problems from an image of the problem, explaining the various settings menus on your device, and more. They were designed to work not only with ChatGPT on your phone, but also with desktop clients such as the macOS app, which is now available to all ChatGPT users.
“ChatGPT's advanced voice mode can understand and respond to emotions and non-verbal cues, bringing you closer to a real-time, natural conversation with an AI,” OpenAI wrote. “Our mission is to deliver these new experiences thoughtfully.”
Onstage at the launch event, OpenAI employees demonstrated ChatGPT responding nearly instantly to requests, such as solving a math problem on a piece of paper held in front of a researcher's smartphone camera.
OpenAI's advanced voice mode sparked considerable controversy over the similarity of the default “Sky” voice to that of actress Scarlett Johansson. Johansson later released a statement saying she had hired legal counsel to inquire about the voice and obtain precise details about how it was developed, and that she had declined multiple requests from OpenAI to license her voice for ChatGPT.
OpenAI denied that it had used Johansson's voice or a similar voice without permission, but later removed the voice in question.