Just a week after launching the latest version of its Gemini model, Google today announced Gemma, a new family of lightweight open-weight models. Starting with Gemma 2B and Gemma 7B, these new models are “Gemini-inspired” and are available for commercial and research use.
Google did not provide a detailed comparison of how these models perform against similar models from, say, Meta or Mistral, saying only that they are “state-of-the-art.” The company notes, however, that these are dense decoder-only models, the same architecture it used for its Gemini models (and its earlier PaLM models), and that benchmark results will appear later today on Hugging Face’s leaderboard.
To get started with Gemma, developers have access to ready-to-use Colab and Kaggle notebooks, as well as integrations with Hugging Face, MaxText, and Nvidia’s NeMo. The models come pre-trained and instruction-tuned and can run anywhere.
Although Google emphasizes that these are open models, it’s worth noting that they are not open source. Indeed, in a press briefing ahead of today’s announcement, Google’s Jeanine Banks emphasized the company’s commitment to open source but was also very intentional about how Google refers to the Gemma models.
“[Open models have] become pretty ingrained in the industry now,” Banks said. “And it often refers to open-weight models, where developers and researchers have broad access to customize and fine-tune the models, but where terms and conditions apply to things like redistribution and ownership of the variants they develop. Since that differs from what we would traditionally call open source, we decided that it made the most sense to call the Gemma models open models.”
This means developers can use the models for inference and fine-tune them at will, and the Google team argues that these model sizes are a good fit for many use cases.
“The quality of generation has improved significantly over the last year,” said Tris Warkentin, director of product management at Google DeepMind. “Things that were previously only possible with very large models are now possible with modern small models. That unlocks entirely new ways to develop AI applications that we’re quite excited about, including the ability to run inference and orchestrate it on a single host in GCP.”
The same is true of the open models from Google’s competitors in this space, though, so we’ll have to see how the Gemma models perform in real-world scenarios.
In addition to the new models, Google is also releasing a new Responsible Generative AI Toolkit, which provides “guidance and essential tools for creating safer AI applications with Gemma,” as well as a debugging tool.