Meta, which develops Llama, one of the largest open-source large language models, believes that training future models will require far more computing power.
Mark Zuckerberg said during Meta's second-quarter earnings call on Tuesday that training Llama 4 will require roughly 10 times the computing power needed to train Llama 3. Rather than fall behind its competitors, he wants Meta to build out its model-training capacity in advance.
“The amount of compute required to train Llama 4 will likely be almost 10 times what we used to train Llama 3, and future models will continue to grow beyond that,” Zuckerberg said.
“How this will play out over the next few generations is difficult to predict, but right now, given the long lead times to launch new inference projects, I think it's better to build capacity before you need it than to risk building it too late.”
Meta released Llama 3 in April, in versions with 8 billion and 70 billion parameters. Last week, the company released an upgraded model, Llama 3.1 405B, whose 405 billion parameters make it Meta's largest open-source model to date.
Meta CFO Susan Li also said the company is exploring various data center projects and building capacity to train future AI models, and that this investment is expected to drive capital expenditures higher in 2025.
Training large language models can be a costly undertaking, and Meta's capital expenditures rose nearly 33% to $8.5 billion in the second quarter of 2024 from $6.4 billion in the same period last year, driven by investments in servers, data centers and network infrastructure.
According to a report from The Information, OpenAI is spending $3 billion to train its models and another $4 billion to rent servers at a discount from Microsoft.
“As we expand our generative AI training capacity to evolve our underlying models, we continue to build our infrastructure in a way that gives us flexibility in how we use it over time, allowing us to direct training capacity towards generative AI inference or our core ranking and recommendation work where we expect it will be more valuable,” Li said on the conference call.
During the call, Meta also discussed usage of its consumer-facing Meta AI, noting that India is the chatbot's largest market. Even so, Li said the company does not expect its generative AI products to contribute significantly to revenue.