Update README.md
README.md CHANGED
@@ -1,5 +1,9 @@
 *This model was trained as part of a series of experiments testing the performance of pure DPO vs SFT vs ORPO, all supported by Unsloth/Huggingface TRL.*
 
+**Benchmarks**
+
+TBA
+
 **Training Details**
 
 Dataset: https://huggingface.co/datasets/argilla/dpo-mix-7k