This is an ORPO fine-tune of [google/gemma-2b](https://huggingface.co/google/gemma-2b) with
[`alvarobartt/dpo-mix-7k-simplified`](https://huggingface.co/datasets/alvarobartt/dpo-mix-7k-simplified).

**⚡ Quantized version (GGUF)**: https://huggingface.co/anakin87/gemma-2b-orpo-GGUF

## ORPO
[ORPO (Odds Ratio Preference Optimization)](https://arxiv.org/abs/2403.07691) is a new training paradigm that combines the usually separated phases of SFT (Supervised Fine-Tuning) and Preference Alignment (usually performed with RLHF or simpler methods like DPO).
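To make the objective concrete, here is a minimal sketch of the ORPO loss for scalar sequence probabilities (the function names, the `lam` weight, and the use of plain floats instead of token-level log-probs are illustrative assumptions, not the paper's implementation):

```python
import math

def odds(p):
    # Odds of assigning probability p to a response: p / (1 - p).
    return p / (1.0 - p)

def orpo_penalty(p_chosen, p_rejected):
    # Odds-ratio term: -log sigmoid(log(odds(chosen) / odds(rejected))).
    # Small when the model strongly prefers the chosen response,
    # large when it prefers the rejected one.
    log_odds_ratio = math.log(odds(p_chosen) / odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))

def orpo_loss(nll_chosen, p_chosen, p_rejected, lam=0.1):
    # Single combined objective: the usual SFT negative log-likelihood
    # on the chosen response, plus a weighted odds-ratio penalty.
    # This is what lets ORPO skip a separate alignment phase.
    return nll_chosen + lam * orpo_penalty(p_chosen, p_rejected)
```

In practice you would not compute this by hand: the TRL library ships an `ORPOTrainer` that applies this objective directly to a causal LM over a chosen/rejected preference dataset.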