On Thursday, OpenAI released a chatbot that effectively costs $200 a month, and the AI community had no idea what to make of it.
The company's new ChatGPT Pro plan grants access to “o1 pro mode,” which OpenAI says “uses more compute to get the best answers to your toughest questions.” In other words, o1 pro mode, an upgraded version of OpenAI's o1 reasoning model, should be able to answer science, math, and coding questions more “robustly” and “comprehensively,” according to OpenAI.
Almost immediately, people began putting it through its paces, asking it to draw unicorns…
I asked ChatGPT o1 Pro Mode to create a unicorn SVG.
(This is the model you can access for $200 per month) pic.twitter.com/h9HwY3aYwU
— Rammy (@rammydev) December 5, 2024
…design a “crab-based” computer…
We have finally introduced o1-pro to the ultimate use case. pic.twitter.com/nX4JAjx71m
— Ethan Mollick (@emollick) December 6, 2024
…and wax poetic about the meaning of life.
I just signed up for a $200/month subscription to OpenAI.
Reply with your questions and I'll post its answers in this thread. pic.twitter.com/oTQxbPxnoP
— Garrett Scott 🕳 (@thegarrettscott) December 5, 2024
But many on X seemed unconvinced that o1 pro mode's answers were worth $200 a month.
“Has OpenAI shared any concrete examples of prompts that fail on regular o1 but succeed on o1-pro?” asked British computer scientist Simon Willison. “I'd like to see one concrete example of its benefits.”
That's a fair question. This is, after all, the world's most expensive chatbot subscription. Granted, the plan includes other perks, such as lifted rate limits and unlimited access to OpenAI's other models. But $2,400 a year is no small sum, and the value proposition of o1 pro mode in particular remains murky.
Counterexamples weren't hard to find. o1 pro mode struggles with Sudoku, and it's tripped up by an optical-illusion joke that's obvious to any human.
o1 and o1-pro both failed here. Probably due to visual limitations (same for Sudoku puzzles) https://t.co/mAVK7WxBrq pic.twitter.com/O9boSv7ZGt
— Tibor Blaho (@btibor91) December 5, 2024
OpenAI's internal benchmarks show that o1 pro mode performs only slightly better than standard o1 on coding and math problems.
Image credit: OpenAI
To showcase o1 pro mode's consistency, OpenAI ran a “more rigorous” evaluation on the same benchmarks, counting a question as solved only if the model answered it correctly four out of four times. Even then, the improvements weren't dramatic.
Image credit: OpenAI
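For a sense of what that four-out-of-four criterion actually measures, here is a minimal, purely illustrative sketch. It is not OpenAI's evaluation code, and `model_answer` and `is_correct` are hypothetical stand-ins for a model call and an answer grader; it just contrasts an ordinary single-attempt pass rate with a strict all-attempts solve rate.

```python
# Illustrative sketch of a "4 of 4" strict solve rate vs. a single-attempt
# pass rate. `model_answer` and `is_correct` are hypothetical stand-ins.
import random
from typing import Callable, List


def strict_solve_rate(
    questions: List[str],
    model_answer: Callable[[str], str],
    is_correct: Callable[[str, str], bool],
    attempts: int = 4,
) -> float:
    """Fraction of questions the model answers correctly on ALL attempts."""
    solved = 0
    for q in questions:
        if all(is_correct(q, model_answer(q)) for _ in range(attempts)):
            solved += 1
    return solved / len(questions)


def single_try_rate(
    questions: List[str],
    model_answer: Callable[[str], str],
    is_correct: Callable[[str, str], bool],
) -> float:
    """Ordinary pass rate: one attempt per question."""
    return sum(is_correct(q, model_answer(q)) for q in questions) / len(questions)


if __name__ == "__main__":
    # Toy demo: a simulated "model" that answers correctly 80% of the time.
    random.seed(0)
    qs = [f"question {i}" for i in range(1000)]
    answer = lambda q: "right" if random.random() < 0.8 else "wrong"
    correct = lambda q, a: a == "right"
    print("single-attempt rate:", single_try_rate(qs, answer, correct))   # ~0.80
    print("4-of-4 strict rate: ", strict_solve_rate(qs, answer, correct))  # ~0.41
```

The point of such a metric is that it rewards consistency rather than peak performance: a model that is right 80% of the time on any single attempt clears the four-of-four bar on only about 41% of questions.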
OpenAI CEO Sam Altman, who once wrote that OpenAI was on a path toward “intelligence too cheap to meter,” was forced to clarify multiple times on Thursday that ChatGPT Pro isn't for most people.
“Most users will be very happy with o1 in the [ChatGPT] Plus tier!” he said on X. “Almost everyone will get the most benefit from the free or Plus tier.”
So who is it for, exactly? Are there really people willing to part with $200 a month to answer toy questions like “Write a three-paragraph essay about strawberries without using the letter ‘e’” or “Solve this Math Olympiad problem,” with little assurance that standard o1 couldn't answer them just as well?
We asked Ameet Talwalkar, associate professor of machine learning at Carnegie Mellon University and venture partner at Amplify Partners, for his thoughts. “Increasing the price 10x seems like a big risk to me,” he told TechCrunch via email. “I think we'll get a better sense of the demand for this feature in just a few weeks.”
Guy Van den Broeck, a computer science professor at UCLA, was blunter in his assessment. “I don't know if the price makes sense, and whether expensive reasoning models will become the norm,” he told TechCrunch.
o1 is “better than most humans at most tasks” because, yes, humans only exist in amnesiac, disembodied multi-turn chat interfaces https://t.co/zbLY2BG5pQ
— Aidan McLau (@aidan_mclau) December 6, 2024
To be charitable to OpenAI, this is partly a marketing failure. Saying that o1 pro mode excels at solving “the toughest problems” tells prospective customers very little, and vague statements about how the model can “think longer” and demonstrate “intelligence” don't help. As Willison points out, without concrete examples of this supposedly improved capability, it's hard to justify paying more at all, let alone ten times the price.
As best I can tell, the intended audience is experts in specialized fields. OpenAI says it plans to grant a small number of medical researchers at “major institutions” free access to ChatGPT Pro, which includes o1 pro mode. Mistakes carry high stakes in medicine, and as Bob McGrew, OpenAI's former chief research officer, noted on X, improved reliability is probably the main unlock of o1 pro mode.
I've been playing around with o1 and o1-pro a bit.
They are very good and a little weird. They are also not for most people most of the time. You really need particular hard problems to solve in order to get value out of them. But if you have those problems, this is a very big deal.
— Ethan Mollick (@emollick) December 5, 2024
McGrew also described o1 pro mode as an example of what he called an “intelligence overhang”: because of the fundamental limitations of a simple text-based interface, users (and perhaps even the model's creators) don't know how to extract value from the “extra intelligence.” As with OpenAI's other models, the only way to interact with o1 pro mode is through ChatGPT, and ChatGPT, McGrew notes, isn't perfect.
But $200 sets expectations high, and judging by its early reception on social media, ChatGPT Pro is far from a slam dunk.