Introducing: Zephyr Gemma!
The community has struggled to do a good preference-tune of Gemma, so the amazing @lewtun and @philschmid built an open-source recipe and trained a model to help people get started.
Handbook: https://github.com/huggingface/alignment-handbook/blob/main/recipes/zephyr-7b-gemma/README.md
Model: HuggingFaceH4/zephyr-7b-gemma-v0.1
Demo: HuggingFaceH4/zephyr-7b-gemma-chat
Some interesting details:
- Fine-tuned on DEITA, then preference-tuned with DPO on an Argilla preference dataset (a rough sketch of the DPO stage follows this list)
- Very strong MT-Bench score (7.81), better than Zephyr Beta (Mistral-based) and Gemma Instruct
- Can run locally with tools such as llama.cpp on a Mac (an inference sketch also follows the list)
- Weaker AGIEval results than Mistral-based tunes
- All training code is open-sourced
- Trained for 105 minutes on 8x H100
- No system message support in the chat template
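
To make the recipe's two-stage idea concrete, here is a minimal sketch of the DPO stage using TRL. It is illustrative only: the SFT checkpoint ID, the dataset ID, and all hyperparameters are assumptions rather than the handbook's exact configuration, and TRL argument names have shifted across releases (older versions pass `tokenizer=` instead of `processing_class=` and take `beta` on the trainer).

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Start from the SFT checkpoint (repo ID is an assumption).
model_id = "HuggingFaceH4/zephyr-7b-gemma-sft-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Preference pairs with "chosen" and "rejected" conversations
# (dataset ID is an assumption about the Argilla mix used).
dataset = load_dataset("argilla/dpo-mix-7k", split="train")

args = DPOConfig(
    output_dir="zephyr-gemma-dpo",
    beta=0.05,  # illustrative value, not necessarily the recipe's
    per_device_train_batch_size=2,
    num_train_epochs=1,
)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```

The handbook linked above has the actual recipe and hyperparameters; this only shows the shape of the code.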
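
And a minimal local-inference sketch with the transformers pipeline. The prompt and generation settings are illustrative; note the conversation starts directly with a user turn, since the chat template has no system role.

```python
import torch
from transformers import pipeline

# Load the model (assumes enough memory for bfloat16 weights).
pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-gemma-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# No system message: the first turn is the user's.
messages = [{"role": "user", "content": "Explain DPO in one paragraph."}]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

outputs = pipe(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    return_full_text=False,  # print only the completion, not the prompt
)
print(outputs[0]["generated_text"])
```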
Big kudos to the team! Super exciting to see a good fine-tune for Gemma.