AI labs on the path to superintelligent systems are beginning to realize they may have to take a detour.
According to several AI investors, founders, and CEOs who spoke to TechCrunch, the "AI scaling laws," the techniques and expectations labs have used to improve the capabilities of their models over the past five years, are now showing signs of diminishing returns. Their sentiment reflects recent reports that model improvements inside major AI labs are slowing compared to the past.
There now seems to be agreement that simply using more compute and more data while pre-training large language models won't turn them into some kind of omniscient digital god. That may sound obvious, but these scaling laws were a key factor in developing ChatGPT and making it better, and they likely led many AI CEOs to make bold predictions that AGI would arrive in just a few years.
Ilya Sutskever, co-founder of OpenAI and Safe Superintelligence, told Reuters last week that "everyone is looking for the next thing" to scale AI models. Earlier this month, a16z co-founder Marc Andreessen said on a podcast that AI models currently seem to be converging toward the same ceiling on capabilities.
But now, almost as soon as these worrying trends emerged, AI CEOs, researchers, and investors are declaring that we are entering a new era of scaling laws. "Test-time compute," which gives AI models more time and computation to "think" before answering a question, is an especially promising candidate to be the next big thing.
"We are seeing the emergence of a new scaling law," Microsoft CEO Satya Nadella said onstage at Microsoft Ignite on Tuesday, referring to the test-time compute research underpinning OpenAI's o1 model.
He's not the only one now pointing to o1 as the future.
"We are now in a second era of scaling laws, which is test-time scaling," Anjney Midha, a partner at Andreessen Horowitz, board member at Mistral, and angel investor in Anthropic, said in a recent interview with TechCrunch.
If the unexpected success of the previous AI scaling laws, and this sudden slowdown, tell us anything, it's that predicting when and how AI models will improve is extremely difficult.
Either way, a paradigm shift appears to be underway: the way AI labs try to improve their models over the next five years likely won't resemble the past five.
What are AI scaling laws?
The rapid improvements in AI models that OpenAI, Google, Meta, and Anthropic have achieved since 2020 can largely be attributed to one key insight: using more compute and more data during an AI model's pre-training phase.
In this phase, when an AI model identifies and stores patterns in large datasets, researchers have found that giving machine learning systems richer resources tends to make the models better at predicting the next word or phrase.
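The diminishing returns at the heart of this story can be sketched with a toy formula. The shape, an irreducible loss plus power-law terms that shrink with model size and data, follows the Chinchilla-style fits published by DeepMind; the constants below are borrowed from that paper purely for illustration and don't describe any particular lab's models.

```python
# Illustrative only: a Chinchilla-style power-law for pre-training loss.
# The constants roughly match DeepMind's published Chinchilla fit and are
# used here just to show the shape of the curve, not any real lab's runs.

def pretraining_loss(params: float, tokens: float) -> float:
    """Toy scaling-law loss: an irreducible term plus power-law terms
    that shrink as model size (params) and data (tokens) grow."""
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / params**alpha + B / tokens**beta

# Each 10x increase in scale buys a smaller absolute drop in loss,
# which is what "diminishing returns" means in practice.
for scale in (1e9, 1e10, 1e11):
    print(f"{scale:.0e} params/tokens -> loss {pretraining_loss(scale, scale):.3f}")
```

Note how the printed losses keep falling but by smaller and smaller amounts; the curve flattens rather than hitting a hard wall.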
This first generation of AI scaling laws pushed the boundaries of what computers could do as engineers increased the number of GPUs they used and the amount of data they fed into them. Even if this particular method has run its course, it has already redrawn the map. Every Big Tech company has essentially gone all in on AI, and Nvidia, which supplies the GPUs all of these companies use to train their models, is now the most valuable publicly traded company in the world.
However, these investments were also made with the expectation that scaling would continue as expected.
It is important to note that scaling laws are not laws of nature, physics, math, or government. Nothing and no one guarantees they will continue at the same pace. Even Moore's law, another famous scaling law, eventually petered out, though it certainly had a long run.
"If you just put in more compute, put in more data, and make the model bigger, there are diminishing returns," Anyscale co-founder and former CEO Robert Nishihara said in an interview with TechCrunch. "We also need new ideas to keep the scaling laws going and keep the rate of progress increasing."
Nishihara is well acquainted with AI scaling laws. Anyscale reached a $1 billion valuation by developing software that helps OpenAI and other AI model developers scale their training workloads to tens of thousands of GPUs. Anyscale is one of the biggest beneficiaries of pre-training compute scaling laws, but even its co-founder recognizes that the era is changing.
“Even if you read a million reviews on Yelp, you probably won't get as much out of your next Yelp review,” Nishihara said of the limits of data scaling. “But that's pre-training. I think the post-training methodology is pretty immature and leaves a lot of room for improvement.”
To be clear, AI model developers will likely continue chasing larger compute clusters and larger datasets for pre-training, and there is probably more improvement to squeeze out of those methods. Elon Musk recently finished building a 100,000-GPU supercomputer, dubbed Colossus, to train xAI's next models, and there will be more, and bigger, clusters to come.
But trends suggest that simply using more GPUs will not deliver exponential growth under existing strategies, which is why new techniques are suddenly gaining attention.
Test-time compute: the AI industry's next big bet
When OpenAI released a preview of the o1 model, the startup announced that it was part of a new model series separate from GPT.
OpenAI improved its GPT models mainly through traditional scaling laws: more data and more power during pre-training. But now that method reportedly isn't delivering the gains it once did. The o1 framework of models instead relies on a new concept, test-time compute, so called because the computing resources are used after a prompt rather than before. The technique hasn't been explored much yet in the context of neural networks, but it is already showing promise.
Some are already pointing to test-time compute as the next way to scale AI systems.
"A lot of experiments are showing that even though pre-training scaling laws may be slowing, test-time scaling laws, where you give a model more compute at inference, can deliver increasing gains in performance," said a16z's Midha.
"OpenAI's new 'o' series pushes [chain-of-thought] further, and it requires far more computing resources, and therefore energy," prominent AI researcher Yoshua Bengio wrote in an op-ed on Tuesday. "We thus see a new form of computational scaling appear: not just more training data and bigger models, but more time spent 'thinking' about answers."
Over the course of 10 to 30 seconds, OpenAI's o1 model re-prompts itself several times, breaking a large problem down into a series of smaller ones. Even though ChatGPT says it is "thinking," it isn't actually doing what a human would do. But human problem-solving, which benefits from restating a problem clearly and working through it step by step, was a key inspiration for the method.
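OpenAI hasn't disclosed how o1 works internally, but the general idea of spending extra compute at inference time is often sketched as best-of-n sampling: draw several candidate answers to the same prompt, score each with a verifier, and keep the best. The `solve`, `sample`, and `score` names below are hypothetical stand-ins for illustration, not any real API.

```python
from itertools import cycle

# A generic sketch of test-time compute, NOT OpenAI's actual o1 method:
# sample several candidate answers, score each with a verifier, keep the
# best. More "thinking" at inference time means drawing more candidates
# before committing to an answer.

def solve(prompt, sample, score, n_samples):
    candidates = [sample(prompt) for _ in range(n_samples)]
    return max(candidates, key=score)

# Toy stand-ins: the "model" cycles through canned guesses, and the
# "verifier" prefers answers closer to the true value, 42.
guesses = cycle([7, 63, 41, 88, 42, 19])
sample = lambda prompt: next(guesses)
score = lambda answer: -abs(answer - 42)

print(solve("q", sample, score, n_samples=2))  # only sees 7 and 63 -> 63
print(solve("q", sample, score, n_samples=4))  # sees 41, 88, 42, 19 -> 42
```

The second call spends twice the compute and finds a strictly better answer, which is the whole bet behind test-time scaling.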
About 10 years ago, Noam Brown, who now leads OpenAI's work on o1, was trying to build AI systems that could beat humans at poker. In a recent talk, Brown said he noticed at the time that human poker players took time to consider different scenarios before playing a hand. In 2017, he introduced a method that forced a model to "think" for 30 seconds before playing. In that time, the AI played out different subgames, figuring out how various scenarios would unfold, to determine the best move.
In the end, the AI performed seven times better than previous attempts.
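Brown's actual poker algorithms are far more sophisticated, but the core idea he describes, spending a fixed "thinking" budget simulating scenarios before committing to a move, can be sketched like this. All the names and payoffs here are invented for illustration.

```python
import random
import time

# Toy sketch of a fixed thinking budget for a game-playing agent (inspired
# by the poker-bot idea above, not Brown's actual algorithm): keep running
# simulations until time runs out, then play the best-scoring move.

def choose_move(moves, simulate, budget_s=0.1):
    totals = {m: 0.0 for m in moves}
    counts = {m: 0 for m in moves}
    deadline = time.monotonic() + budget_s
    # Simulate moves round-robin until the thinking budget expires.
    while time.monotonic() < deadline:
        for m in moves:
            totals[m] += simulate(m)
            counts[m] += 1
    # Pick the move with the best average simulated payoff.
    return max(moves, key=lambda m: totals[m] / counts[m])

# Toy game: "raise" has the best average payoff, but every single
# simulation is noisy, so one sample alone could easily mislead.
random.seed(1)
payoffs = {"fold": 0.0, "call": 0.3, "raise": 0.5}
simulate = lambda move: payoffs[move] + random.gauss(0, 0.2)

print(choose_move(["fold", "call", "raise"], simulate))
```

A bigger budget means more simulations, tighter payoff estimates, and a more reliable choice; that trade of time for quality is exactly what Brown exploited.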
Admittedly, Brown's 2017 work did not use neural networks, which were less popular at the time. But last week, MIT researchers published a paper showing that test-time compute can significantly improve an AI model's performance on reasoning tasks.
It's not immediately clear how test-time compute will best be scaled. It could mean letting AI systems "think" about hard questions for a very long time; maybe hours, maybe days. Another approach could be letting an AI model "think" about a question on many chips at the same time.
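The "many chips at once" approach is often illustrated with the self-consistency trick: run many reasoning attempts in parallel and majority-vote on the final answer. The sketch below uses threads as stand-ins for separate accelerators; `reasoning_attempt` is a toy function, not a real model call.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Sketch of parallel test-time compute: run several reasoning attempts
# at once and majority-vote on the answer. Each thread stands in for
# one accelerator running its own sample of the model.

def reasoning_attempt(attempt_id: int) -> int:
    # Toy stand-in for one sampled chain of thought: most attempts reach
    # the right answer (42), but every third one goes astray (41).
    return 42 if attempt_id % 3 else 41

def parallel_answer(n_attempts: int) -> int:
    with ThreadPoolExecutor(max_workers=8) as pool:
        answers = list(pool.map(reasoning_attempt, range(n_attempts)))
    # Majority vote: the most common answer across attempts wins.
    return Counter(answers).most_common(1)[0][0]

print(parallel_answer(9))  # 6 of 9 attempts say 42, so the vote returns 42
```

A single attempt can be wrong, but the vote across nine parallel attempts recovers the majority answer, and adding chips raises reliability without adding wall-clock "thinking" time.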
If test-time compute does take off as the next frontier for scaling AI systems, Midha says demand for AI chips specialized in high-speed inference could rise dramatically. That could be good news for startups such as Groq and Cerebras, which specialize in fast AI inference chips. If finding the answer becomes as compute-heavy as training the model, the "picks and shovels" providers of AI win again.
The AI world isn't panicking yet
Most of the AI industry seems unfazed by these old scaling laws slowing down. Even if test-time compute doesn't prove to be the next wave of scaling, some feel we have only scratched the surface of applications for current AI models.
A new popular product may buy time for AI model developers to find new ways to improve the underlying model.
"We are completely confident that pure application-level work, allowing models to shine through intelligent prompting, UX decisions, and passing context to the models at the right time, can improve model performance by at least 10 to 20x," said Midha.
For example, ChatGPT's Advanced Voice Mode is one of the more impressive applications of current AI models. But it was largely an innovation in user experience, not in the underlying technology. Further UX innovations, such as giving that feature access to the web or to apps on your phone, could make the products much better.
Kian Katanforoosh, CEO of AI startup Workera and an adjunct lecturer on deep learning at Stanford, told TechCrunch that companies building AI applications, like his, don't necessarily need exponentially smarter models to build better products. He also said there is plenty of room to improve products built on today's models.
“Say you're building an AI application and the AI hallucinates on a certain task,” Katanforoosh says. “There are two ways to avoid that: Either the LLM gets better and the hallucinations disappear, or the tools around the LLM get better and we have the opportunity to solve the problem.”
Regardless of what happens on the AI research front, users probably won't feel the effects of these shifts for some time. That said, AI labs will do whatever it takes to keep shipping bigger, smarter, and faster models at the same pace, and that means some Big Tech companies may pivot in how they push the boundaries of AI.