When it rains, it pours when it comes to cutting-edge AI models. Mistral released its latest flagship model, the Large 2, on Wednesday, which the company claims is on par with the latest state-of-the-art models from OpenAI and Meta in terms of code generation, mathematics and inference.
The release of Mistral Large 2 comes just one day after Meta released its latest and greatest open source model, Llama 3.1 405b. Mistral says Large 2 raises the bar for performance and cost for open models, and they back that up with some benchmarks.
The Large 2 appears to outperform Llama 3.1 405B in terms of code generation and math performance, but does so with less than a third of the parameters – 123 billion to be exact.
Mistral said in a press release that one of its main focuses during training was minimizing the issue of hallucinations in the model. The company said Large 2 was trained to respond more discriminatingly, such as admitting when it doesn't know something, rather than making up plausible ones.
Mistral, a Paris-based AI startup, recently raised $640 million in a Series B funding round led by General Catalyst, bringing its valuation to $6 billion. Although Mistral is a new entrant in the artificial intelligence space, it is rapidly shipping AI models that are at or near the state of the art.
However, it's important to note that Mistral's model, like most others, is not open source in the traditional sense: commercial use of the model requires a paid license, and while it's more open than something like GPT-4o, very few companies in the world have the expertise and infrastructure to implement such a large model (which, of course, is twice as large as Llama's 405 billion parameters).
One feature missing from Mistral Large 2, and also missing from Meta's Llama 3.1, released yesterday, is multimodal capabilities. OpenAI is far ahead of its competitors when it comes to multimodal AI systems that can process images and text simultaneously, and it's a feature that some startups are increasingly hoping to build on.
The model has a window of 128,000 tokens, meaning Large 2 can ingest a lot of data in a single prompt (128,000 tokens is roughly the equivalent of a 300-page book). Mistral's new model also includes improved multilingual support: Large 2 understands English, French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, as well as 80 programming languages. Notably, Mistral claims Large 2 generates more concise responses than leading AI models, which tend to be long-winded.
Mistral Large 2 is available in Google Vertex AI, Amazon Bedrock, Azure AI Studio and IBM watsonx.ai. The new model is also available on Mistral's le Plateforme under the name “mistral-large-2407” and can be tested for free on the startup's ChatGPT competitor, le Chat.