Elon Musk's xAI released its Grok large language model as “open source” over the weekend. The billionaire clearly hopes to pit his company against rival OpenAI, which, despite its name, is not very open. But does releasing the code for something like Grok actually contribute to the AI development community? Yes and no.
Grok is a chatbot trained by xAI to fill the same vaguely defined role as the likes of ChatGPT and Claude: you ask a question, it answers. This LLM, however, was given a cheeky tone and special access to Twitter data as a way of differentiating it from the rest.
As always, these systems are nearly impossible to evaluate, but the general consensus seems to be that it's competitive with previous-generation mid-size models like GPT-3.5. (Whether you find this impressive given the short development time, or disappointing given the budget and bombast surrounding xAI, is entirely up to you.)
Either way, Grok is a modern, functional LLM of significant size and capability, and the more access the development community has to the guts of such things, the better. The problem lies in defining “open” in a way that does more than let a company (or billionaire) claim the moral high ground.
This isn't the first time the terms “open” and “open source” have been questioned or abused in the AI world. And we're not just talking about technical quibbles, such as picking a usage license that isn't as open as another (Grok is Apache 2.0).
To begin with, AI models are unlike other software when it comes to making them “open source.”
If you're building a word processor, for example, it's relatively easy to open source it: you publish all the code and let the community suggest improvements or spin off their own versions. Part of what makes open source so valuable as a concept is that every aspect of the application is original or credited to its original creator. This transparency and adherence to correct attribution isn't just a by-product; it's at the core of the very concept of openness.
With AI, this is arguably not possible at all. The way machine learning models are created involves a largely unknowable process in which a tremendous amount of training data is distilled into a complex statistical representation whose structure no human actually directed, or could even explain. This process cannot be inspected, audited, and improved the way traditional code can. So while it still has enormous value in one sense, it can never really be open. (The standards community hasn't even defined what “open” means in this context, but is actively discussing it.)
That hasn't stopped AI developers and companies from designing their models as “open” and claiming they are, though the term has largely lost its meaning in this context. Some call their model “open” if it has a public-facing interface or API. Some call it “open” if they publish a paper describing the development process.
Arguably the closest an AI model gets to “open source” is when its developers release its weights: the precise attributes of the countless nodes of its neural network, which perform vector math operations in a precise order to complete the pattern started by a user's input. But even an “open weights” model like LLaMa-2 excludes other important data, such as the training dataset and process, which would be necessary to recreate it from scratch. (Some projects go further, of course.)
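To make “weights” concrete, here's a minimal sketch in Python of a toy network, nothing like Grok's actual architecture: released weights are just arrays of learned numbers, applied to an input vector in a fixed order.

```python
import numpy as np

# A toy two-layer network: the "weights" are just matrices of numbers.
# Releasing a model's weights means publishing arrays like these,
# not the training data or process that produced their values.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))  # stand-ins for learned parameters
W2 = rng.standard_normal((4, 2))

def forward(x):
    """Apply the weights to user input in a precise, fixed order."""
    hidden = np.maximum(0, x @ W1)  # matrix multiply, then ReLU
    return hidden @ W2              # final projection

print(forward(rng.standard_normal(8)))  # completes the pattern started by the input
```

Notice what's missing from such a release: nothing in those arrays tells you what data produced them or how, which is exactly the gap the “open weights” label papers over.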
All of this is before even mentioning the fact that it takes millions of dollars in computing and engineering resources to create or replicate these models, effectively restricting who can create and replicate them to companies with considerable resources.
So where does xAI's Grok release fall on this spectrum?
It's an open-weights model, so anyone can download it, use it, modify it, fine-tune it, or distill it. That's good! It also appears to be one of the largest models anyone can freely access this way by parameter count: 314 billion. That gives curious engineers plenty to work with if they want to test how it behaves after various modifications.
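For the curious, here's one way to start poking at such a release, assuming the weights are mirrored on Hugging Face under the repository name xai-org/grok-1 (an assumption; verify the actual distribution channel, and note the full weights run to hundreds of gigabytes):

```python
# A sketch of fetching just the metadata of an open-weights release for
# inspection. The repo id "xai-org/grok-1" is an assumption; check the
# real distribution channel before running.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="xai-org/grok-1",           # assumed repository name
    allow_patterns=["*.json", "*.md"],  # configs and docs only, not ~300 GB of weights
)
print(f"Downloaded to: {local_dir}")
```

Filtering to configs and docs first is deliberate: you can read the architecture details without committing to the full multi-hundred-gigabyte download.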
However, the size of the model comes with a significant drawback: in its raw form it requires hundreds of gigabytes of high-speed RAM. Unless you already own, say, a dozen Nvidia H100s in a six-figure AI inference rig, don't bother clicking that download link.
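A rough back-of-envelope calculation, assuming 16-bit weights and the H100's 80 GB of memory, shows why:

```python
# Back-of-envelope memory math for serving a 314-billion-parameter model.
params = 314e9           # Grok-1's reported parameter count
bytes_per_param = 2      # assuming 16-bit (fp16/bf16) weights
weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~628 GB

h100_gb = 80             # memory per Nvidia H100 GPU
gpus = -(-weights_gb // h100_gb)  # ceiling division
print(f"H100s just to hold the weights: {gpus:.0f}")  # 8, before activations,
# KV cache, and framework overhead, hence "a dozen" in practice
```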
And while Grok is arguably competitive with some other modern models, it's also far larger than them, meaning it requires more resources to accomplish the same thing. There's always a hierarchy of size, efficiency, and other metrics, and it's still valuable, but this is more raw material than final product. It's also not clear whether this is the latest and best version of Grok, like the clearly tuned version some people have access to via X.
Overall, releasing this data is a good thing, but it's not the game-changer some hoped it might be.
It's also hard not to wonder why Musk is doing this. Is his fledgling AI company really dedicated to open source development? Or is this just mud in the eye of OpenAI, against which Musk is currently pursuing a billionaire-level grievance?
If they are truly committed to open source development, this will be the first of many releases, and hopefully they will take the community's feedback into account, release other crucial information, characterize the training data process, and further explain their approach. If they aren't, and this was only done so Musk can point to it in online arguments, it's still valuable; it's just not something anyone in the AI world will rely on or pay much attention to beyond the next few months as they play with the model.