All-around highly generalizable generative AI models were, and perhaps still are, the name of the game. But as cloud vendors large and small increasingly join the generative AI fray, new models are emerging that focus on the most deep-pocketed potential customers: enterprises.
Case in point: Cloud computing company Snowflake today announced Arctic LLM, a generative AI model described as “enterprise grade.” Snowflake says Arctic LLM, available under the Apache 2.0 license, is optimized for “enterprise workloads,” including database code generation, and is free for research and commercial use.
“I think this is the foundation for us at Snowflake and our customers to build enterprise-grade products and really start to realize the promise and value of AI,” CEO Sridhar Ramaswamy said at a press conference. “This should be considered our first, but major, step in the world of generative AI, and there will be many more to come.”
An enterprise model
My colleague Devin Coldewey recently wrote about the seemingly endless onslaught of generative AI models. I recommend reading his article, but the gist is: models are an easy way for vendors to drum up excitement around their R&D, and they also serve as a funnel into their product ecosystems (model hosting, fine-tuning and so on).
Arctic LLM is no exception. The flagship of a family of generative AI models called Arctic, Arctic LLM took about three months, 1,000 GPUs and $2 million to train. It arrives on the heels of DBRX, another generative AI model marketed as optimized for the enterprise space, from Databricks.
Snowflake directly compared Arctic LLM and DBRX in its press materials, stating that Arctic LLM outperforms DBRX in two tasks: coding (Snowflake did not specify which programming language) and SQL generation. The company says the Arctic LLM outperforms Meta's Llama 2 70B (but not the more recent Llama 3 70B) and Mistral's Mixtral-8x7B at these tasks.
Snowflake also claims that Arctic LLM has the “best performance” in MMLU, a common language understanding benchmark. However, while MMLU aims to assess the reasoning ability of generative models through logical questions, it also includes tests that can be solved by rote memorization, so take that with a grain of salt.
Snowflake says Arctic LLM addresses a specific need within the enterprise, branching off from generic AI applications like composing poetry to focus on enterprise challenges such as developing SQL co-pilots and high-quality chatbots.
Arctic LLM uses the same mixture-of-experts (MoE) architecture as DBRX and Gemini 1.5 Pro, Google's current highest-performing generative model. MoE architectures essentially divide data processing tasks into subtasks and delegate them to smaller, specialized “expert” models. Arctic LLM contains 480 billion parameters, but only 17 billion are active at any one time, spread across 128 separate expert models. (Parameters essentially define a model's skill on a problem, such as analyzing and generating text.)
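The routing idea behind MoE can be sketched in a few lines. This is an illustrative top-k gate in plain Python, not Snowflake's implementation; the expert count, dimensions and gate values here are made up for demonstration:

```python
import math

def moe_forward(x, experts, gate, top_k=2):
    """Route input x through only the top_k highest-scoring experts."""
    # Gate: compute one score per expert for this input.
    scores = [sum(g * xi for g, xi in zip(row, x)) for row in gate]
    # Pick the top_k experts; every other expert stays idle this step.
    top = sorted(range(len(scores)), key=scores.__getitem__)[-top_k:]
    exps = [math.exp(scores[i]) for i in top]
    probs = [e / sum(exps) for e in exps]  # softmax over the winners only
    # Weighted sum of the chosen experts' outputs (each expert is a matrix).
    out = [0.0] * len(experts[0])
    for p, i in zip(probs, top):
        y = [sum(w * xi for w, xi in zip(row, x)) for row in experts[i]]
        out = [o + p * yi for o, yi in zip(out, y)]
    return out

# Four tiny "experts" over 2-dimensional inputs; only top_k=2 of them ever run,
# which is the sense in which a fraction of total parameters is "active".
experts = [[[1, 0], [0, 1]], [[2, 0], [0, 2]], [[0, 1], [1, 0]], [[3, 3], [3, 3]]]
gate = [[1, 0], [0, 1], [1, 1], [-1, -1]]
print(moe_forward([1.0, 2.0], experts, gate, top_k=2))
```

The payoff is the same one Snowflake cites: compute per token scales with the active experts, not the total parameter count.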
Snowflake claims this efficient design allowed it to train Arctic LLM on open public web datasets (including RefinedWeb, C4, RedPajama and StarCoder) at “roughly one-eighth the cost of similar models.”
Running anywhere
Along with Arctic LLM, Snowflake is providing resources such as coding templates and a list of training sources to guide users through the process of getting the model up and running and fine-tuning it for specific use cases. But recognizing that these are likely to be expensive and complex undertakings for most developers (fine-tuning or running Arctic LLM requires around eight GPUs), Snowflake is also pledging to make Arctic LLM available on a range of hosts, including Hugging Face, Microsoft Azure, Together AI's model-hosting service and the enterprise generative AI platform Lamini.
Here's the rub, though: Arctic LLM will first be available on Cortex, Snowflake's platform for building AI- and machine learning-powered apps and services. The company, unsurprisingly, pitches Cortex as the preferred way to run Arctic LLM, citing its “security,” “governance” and scalability.
“Our dream here is, within a year, to have an API that our customers can use so that business users can talk directly to their data,” Ramaswamy said. “It would have been easy for us to say, ‘Oh, we’ll just wait for some open source model and use that.’ Instead, we’re making a foundational investment, because we think [it’s] going to unlock more value for our customers.”
Which makes me wonder: who is Arctic LLM really for, other than Snowflake customers?
In a world full of “open” generative models that can be fine-tuned for practically any purpose, Arctic LLM doesn't stand out in any obvious way. Its architecture might make it more efficient than some of the alternatives, but I'm not convinced the efficiency gains will be dramatic enough to sway enterprises away from the countless other well-known and well-supported, business-friendly generative models (such as GPT-4).
There's also a point in Arctic LLM's disfavor to consider: its relatively small context window.
In generative AI, the context window refers to the input data (e.g. text) that a model considers before generating output (e.g. additional text). Models with small context windows tend to forget the content of even very recent conversations, while models with larger context windows typically avoid this pitfall.
Arctic LLM's context window ranges from around 8,000 to around 24,000 words depending on the fine-tuning method, falling well short of the context of models like Anthropic's Claude 3 Opus and Google's Gemini 1.5 Pro.
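To make the "forgetting" concrete, here's a hedged sketch of how an application might trim conversation history to fit a fixed context budget. The whitespace token counter and the sample messages are stand-ins, not anything from Snowflake; real systems count tokens with the model's own tokenizer:

```python
def fit_context(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    """Keep the most recent messages that fit within a model's context window."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break                   # anything older is dropped ("forgotten")
        kept.append(msg)
        used += cost
    return list(reversed(kept))     # restore chronological order

history = ["first question", "first answer",
           "second question much longer here",
           "second answer", "latest question"]
print(fit_context(history, max_tokens=8))  # → ['second answer', 'latest question']
```

With a small budget, everything before "second answer" falls out of the window, which is exactly why small-context models lose the thread of longer conversations.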
Although Snowflake doesn't mention it in its marketing, Arctic LLM almost certainly suffers from the same limitations and drawbacks as other generative AI models, including hallucinations (i.e. confidently answering requests incorrectly). That's because Arctic LLM, like every other generative AI model in existence, is a statistical probability machine, and one that, again, has a small context window. Based on vast numbers of examples, it guesses which data makes the most “sense” to place where (e.g. the word “go” before “the market” in the sentence “I go to the market”). It will inevitably guess wrong sometimes, and that's a “hallucination.”
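The "statistical probability machine" mechanic can be shown in miniature with a toy bigram model. The corpus here is invented for illustration, and real LLMs condition on far more than the previous word, but the guess-the-next-token principle is the same:

```python
from collections import Counter, defaultdict

# Tiny corpus; a real model trains on vast numbers of examples instead.
corpus = "i go to the market . i go to the park . we go to the market .".split()

# Count which word follows which: the model's entire "knowledge".
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(prev):
    """Return the statistically most likely next word seen after `prev`."""
    return follows[prev].most_common(1)[0][0]

print(next_word("go"))   # "to" — 'go' is always followed by 'to' here
print(next_word("the"))  # "market" — seen twice, vs. "park" once
```

When the statistics point the wrong way for a given request, the model still answers confidently, and that mismatch is what the article calls a hallucination.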
As Devin writes in his article, until the next big technical breakthrough arrives, incremental improvements are all we have to look forward to in the generative AI space. That won't stop vendors like Snowflake from championing those improvements as great achievements, though, and marketing them for all they're worth.