Benjamin Franklin once wrote that nothing is certain except death and taxes. Let us revise that phrase to reflect the current AI gold rush: nothing is certain except death, taxes, and new AI models, the last of which is emerging at an unprecedented pace.
Google released its upgraded Gemini models earlier this week, and OpenAI unveiled its o1 model earlier this month. On Wednesday, it was Meta's turn to show off its latest models at Meta Connect 2024, its annual developer conference in Menlo Park.
Llama diversity
Meta's family of multilingual Llama models has reached version 3.2. With this upgrade from 3.1, several Llama models are now multimodal. The compact model Llama 3.2 11B and the larger, more capable 90B can interpret charts and graphs, caption images, and pinpoint objects in photos, given a simple description.
For example, given a map of a park, Llama 3.2 11B and 90B might be able to answer questions like, “Where is the terrain steepest?” or “How long is this path?” Or, given a graph of a company's revenue over the course of a year, the models might be able to quickly pick out its best-performing months.
Meta says that Llama 3.2 11B and 90B are designed as “drop-in” replacements for 3.1 for developers who want to use the models purely for text applications. Both can be deployed with or without Llama Guard Vision, a new safety tool designed to detect potentially harmful (e.g., biased or toxic) text and images fed into or generated by the models.
The multimodal Llama models are available for download in most parts of the world from a variety of platforms, including Hugging Face, Microsoft Azure, Google Cloud, and AWS. Meta also hosts them on the official Llama site, Llama.com, and uses them as the basis for its AI assistant, Meta AI, on WhatsApp, Instagram, and Facebook.
Image credit: Meta
However, Llama 3.2 11B and 90B are not accessible in Europe. As a result, some Meta AI features available in other regions, such as image analysis, are disabled for European users. Meta again blamed the “unpredictable” nature of the EU regulatory environment.
Meta expressed concerns about the AI Act, the EU law that sets out the legal and regulatory framework for AI, and rejected the associated voluntary safety pledge. Among other requirements, the AI Act requires companies developing AI in the EU to indicate whether their models may be deployed in “high risk” situations, such as policing. Meta worries that because the models are “open,” little information will be available about how they are being used, making it difficult to comply with the AI Act's rules.
Also problematic for Meta are the AI training provisions of GDPR, the EU's sweeping privacy law. Meta trains its models on publicly available data from Instagram and Facebook users who haven't opted out, data that is protected by GDPR in Europe. EU regulators earlier this year asked Meta to stop training on European user data while they evaluate the company's GDPR compliance.
Meta complied, but it has also backed an open letter calling for a “modern interpretation” of GDPR that “does not reject progress.”
Earlier this month, Meta resumed training on U.K. user data after incorporating “feedback from regulators” into a revised opt-out process, but it has yet to share an update on training across the rest of the EU.
A more compact model
The other new Llama models, which were not trained on European user data, were released in Europe (and worldwide) on Wednesday.
Llama 3.2 1B and 3B are lightweight, text-only models designed to run on smartphones and other edge devices, suited to tasks such as summarizing and rewriting paragraphs (e.g., in emails). Optimized for Arm hardware from Qualcomm and MediaTek, 1B and 3B can also tap tools such as calendar apps with minimal setup and take actions autonomously, according to Meta.
There is still no successor to the flagship Llama 3.1 405B, multimodal or otherwise, released in August. Given the massive size of the 405B (which took months to train), this is likely a matter of limited computing resources. We've asked Meta if other factors are at play and will update this article if we hear back.
Meta's new Llama Stack is a set of Llama-specific development tools that can be used to fine-tune all of the Llama 3.2 models (1B, 3B, 11B, and 90B). Meta says that regardless of how they are customized, the models can handle up to about 100,000 words at a time.
Image credit: Meta
Strategies for gaining mind share
Meta CEO Mark Zuckerberg often talks about making the “benefits and opportunities” of AI accessible to all people, but implicit in this statement is that he hopes those tools and models will be created by Meta.
Investing in commoditized models forces competitors (OpenAI, Anthropic, and others) to lower their prices, spreads Meta's flavor of AI more widely, and lets Meta fold in improvements from the open source community. Meta says its Llama models have been downloaded more than 350 million times and are used by large companies such as Zoom, AT&T, and Goldman Sachs.
For many of these developers and companies, it doesn't matter that the Llama models aren't “open” in the strictest sense: Meta's license restricts how certain developers can use them, and platforms with more than 700 million monthly users must request a special license from Meta, which it grants at its sole discretion.
Admittedly, there aren't many platforms of that size without in-house models of their own. But Meta hasn't been particularly transparent about the process. When asked this month whether the company had ever exercised that discretion and approved a license for such a platform, a spokesperson said, “We don't have any information to share on this matter.”
Don't get me wrong, Meta is serious about this. It's spending millions lobbying regulators to win them over to its preferred version of “open” AI, and it's pouring billions into servers, data centers, and network infrastructure to train future models.
None of the Llama 3.2 models solves today's big problems in AI, such as the tendency to make things up and to regurgitate problematic training data (e.g., copyrighted e-books that may have been used without permission, the subject of a class-action lawsuit against Meta). But, as I've written before, they advance one of Meta's key goals: becoming synonymous with AI, and generative AI in particular.