Papers
arxiv:2403.08295

Gemma: Open Models Based on Gemini Research and Technology

Published on Mar 13
Β· Submitted by akhaliq on Mar 14
#2 Paper of the day
Authors:
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,

Abstract

This work introduces Gemma, a family of lightweight, state-of-the art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Gemma outperforms similarly sized open models on 11 out of 18 text-based tasks, and we present comprehensive evaluations of safety and responsibility aspects of the models, alongside a detailed description of model development. We believe the responsible release of LLMs is critical for improving the safety of frontier models, and for enabling the next wave of LLM innovations.

Community

I was curious reading the section on the formatting used for the instruction tuning (pg4), where the developers used special tokens such as 'user' and 'model' to denote who is speaking. I have not read much on instruction tuning, so I was wondering if this was standard practice or a novel idea?

Β·
Paper author

Hi, Surya from the Gemma team -- there are different standards for how people demarcate turns and assign roles in fine-tuning data, but some template is almost used!

For instance, you may have seen the ChatML format (https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/ai-services/openai/includes/chat-markup-language.md) and other chat templates: https://huggingface.co/docs/transformers/main/en/chat_templating.

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Gemma: Open-Sourced AI Excellence Derived from Google's Gemini

Links πŸ”—:

πŸ‘‰ Subscribe: https://www.youtube.com/@Arxflix
πŸ‘‰ Twitter: https://x.com/arxflix
πŸ‘‰ LMNT (Partner): https://lmnt.com/

By Arxflix
9t4iCUHx_400x400-1.jpg

Sign up or log in to comment

Models citing this paper 125

Browse 125 models citing this paper

Datasets citing this paper 1

Spaces citing this paper 52

Collections including this paper 17