If you wanted to raise the profile of a major technology company and had $10 million to spend, what would you do with it? A Super Bowl ad? An F1 sponsorship?
You could also spend it training a generative AI model. Although not marketing in the traditional sense, generative models grab attention and increasingly funnel users toward a vendor's core products and services.
See Databricks' DBRX, a new generative AI model in the vein of OpenAI's GPT series and Google's Gemini, announced today. Available on GitHub and the AI development platform Hugging Face for research as well as commercial use, base (DBRX Base) and fine-tuned (DBRX Instruct) versions of DBRX can be run and tuned on public, custom, or otherwise proprietary data.
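For a sense of what that looks like in practice, here is a minimal sketch of loading the Hugging Face release with the transformers library. The model ID and generation settings are illustrative assumptions, and actually running this requires accepting Databricks' license terms and a considerable amount of GPU memory (more on that below):

```python
# Sketch: loading the Hugging Face release of DBRX with transformers.
# The model ID is assumed from Databricks' Hugging Face listing; settings
# are illustrative, and substantial GPU memory is required.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"  # assumed listing name
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # shard the weights across available GPUs
    torch_dtype="auto",
    trust_remote_code=True,  # the release ships custom model code
)

inputs = tokenizer("What is Databricks?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```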
“DBRX was trained to be useful and provide information on a wide variety of topics,” Naveen Rao, Databricks' vice president of generative AI, told TechCrunch in an interview. “DBRX has been optimized and tuned for English-language usage, but it can converse and translate in a wide variety of languages, including French, Spanish, and German.”
Databricks describes DBRX as “open source,” similar to “open source” models like Meta's Llama 2 and AI startup Mistral's models. (Whether these models truly meet the definition of open source is a subject of intense debate.)
Databricks says it spent about $10 million and eight months training DBRX, and claims (quoting from a press release) that it “outperform[s] all existing open source models on standard benchmarks.”
But here's the marketing catch: unless you're a Databricks customer, DBRX is very difficult to use.
This is because running DBRX in a standard configuration requires a server or PC with at least four Nvidia H100 GPUs. H100s cost thousands of dollars each, quite possibly more. While that may be pocket change for the average enterprise, it's out of reach for many developers and solopreneurs.
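A back-of-envelope calculation shows why. This is a rough sketch assuming DBRX's reported 132 billion total parameters are held in 16-bit precision; it ignores activation and KV-cache overhead:

```python
import math

# Rough memory math behind the multi-GPU requirement. The 132B total
# parameter count is Databricks' reported figure for DBRX; an MoE model
# must keep all experts in memory even though only some run per token.
params = 132e9
bytes_per_param = 2  # 16-bit weights
weights_gb = params * bytes_per_param / 1e9
h100_memory_gb = 80  # HBM on a single H100

print(f"weights alone: ~{weights_gb:.0f} GB")                      # ~264 GB
print(f"H100s needed: {math.ceil(weights_gb / h100_memory_gb)}")   # 4
```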
There's also some fine print. Databricks says companies with more than 700 million active users will face “certain limitations” comparable to Meta's restrictions for Llama 2, and that all users must agree to terms ensuring they use DBRX “responsibly.” (Databricks did not voluntarily provide details of those terms at the time of publication.)
Databricks offers its Mosaic AI Foundation Model product as a managed solution to these obstacles. In addition to running DBRX and other models, it provides a training stack for fine-tuning DBRX on custom data. Rao suggested that customers can privately host DBRX using Databricks' Model Serving offering, or work with Databricks to deploy DBRX on the hardware of their choosing.
Rao added:
Ultimately, the benefit to Databricks is more users on our platform, as we're focused on making the Databricks platform the best choice for building customized models. DBRX is a demonstration of our best-in-class pre-training and tuning platform, which customers can use to build their own models from scratch. It's an easy way for customers to get started with the Databricks Mosaic AI generative AI tools. And DBRX is highly capable out of the box and can be tuned for excellent performance on specific tasks at better economics than large, closed models.
Databricks claims that DBRX runs up to 2x faster than Llama 2, thanks in part to its mixture-of-experts (MoE) architecture. MoE, an approach DBRX shares with Mistral's newer models and Google's recently announced Gemini 1.5 Pro, essentially breaks a data processing task into multiple subtasks and delegates those subtasks to smaller, specialized “expert” models.
Most MoE models have eight experts; DBRX has 16, which Databricks says improves quality.
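To make the idea concrete, here is a minimal sketch of MoE-style routing in PyTorch. It illustrates the general technique only, not DBRX's actual implementation; the dimensions, expert design, and the choice of four active experts per token are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy mixture-of-experts layer: a router sends each token to a few experts."""

    def __init__(self, dim=512, n_experts=16, top_k=4):
        super().__init__()
        # Each "expert" is a small feed-forward network; only top_k of the
        # n_experts actually run for any given token.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(dim, n_experts)  # scores every expert per token
        self.top_k = top_k

    def forward(self, x):                                # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)             # normalize chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for k in range(self.top_k):
                mask = idx[:, k] == e                    # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

layer = MoELayer()
print(layer(torch.randn(8, 512)).shape)  # torch.Size([8, 512])
```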
However, quality is relative.
While Databricks claims that DBRX outperforms Llama 2 and Mistral's models on certain language understanding, programming, math, and logic benchmarks, DBRX falls short of OpenAI's GPT-4 (arguably the leading generative AI model) in most areas outside of niche use cases such as database programming language generation.
Rao acknowledges that DBRX has other limitations as well: like all generative AI models, it can fall victim to “hallucinating” answers to queries, despite Databricks' work on safety testing and red teaming. Because the model was simply trained to associate words and phrases with certain concepts, its responses won't always be accurate if those associations aren't totally accurate.
DBRX is also not multimodal, unlike some more recent flagship generative AI models such as Gemini. (It can only process and generate text, not images.) And we don't know exactly what data sources were used to train it; Rao would only reveal that no Databricks customer data was used in training DBRX.
“We trained DBRX on large datasets from a variety of sources,” he added. “We used open data sets that the community knows, loves, and uses every day.”
When I asked Rao whether any of the DBRX training datasets were copyrighted or licensed, or showed obvious signs of bias (racial bias, for example), he didn't answer directly, saying only that Databricks had been careful about the data it used and had conducted red-teaming exercises to improve the model's weaknesses. Generative AI models are prone to regurgitating training data, a major concern for commercial users of models trained on unlicensed, copyrighted, or clearly biased data. In the worst-case scenario, users could unknowingly incorporate IP-infringing or biased work from a model into their projects, exposing themselves to ethical and legal liability.
Some companies that train and release generative AI models offer policies covering the legal fees arising from potential infringement claims. Databricks currently does not; Rao said the company is “exploring scenarios” under which it might.
Considering this and the other areas where DBRX misses the mark, the model seems like a tough sell to anyone other than current or would-be Databricks customers. Databricks' competitors in generative AI, including OpenAI, offer comparable, if not more attractive, technology at very competitive prices. And many generative AI models come closer to the commonly understood definition of open source than DBRX does.
Rao promises that Databricks will continue to improve DBRX and release new versions as the company's Mosaic Labs R&D team (the team behind DBRX) explores new generative AI avenues.
“DBRX is advancing the field of open source models and challenging ourselves to build future models more efficiently,” he said. “We will continue to release variants as we apply techniques that improve output quality in terms of reliability, safety, and bias. We see the open model as a platform on which our customers can build custom capabilities with our tools.”
Judging by DBRX's current position relative to its peers, it has a very long road ahead.