metadata
license: mit
library_name: transformers
datasets:
- argilla/dpo-mix-7k
base_model: wandb/mistral-7b-zephyr-sft
Mistral 7B Zephyr DPO V2
The Zephyr DPO recipe applied on top of Mistral 7B (new recipe with chatML format)
Model description
- Model type: A 7.2B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- Language(s) (NLP): Primarily English
- Finetuned from model: wandb/mistral-7b-zephyr-sft
Recipe
We trained using the alignment handbook recipe and logging to W&B
Visit the W&B workspace here