Update README.md
README.md CHANGED
@@ -13,7 +13,7 @@ pipeline_tag: text-generation
 
 ### Chocolatine-78B-Instruct-DPO-v1.3
 
-DPO fine-tuned of [dfurman/CalmeRys-78B-Orpo-v0.1](https://huggingface.co/dfurman/CalmeRys-78B-Orpo-v0.1) itself based on multiple fine tunings
+DPO fine-tune of [dfurman/CalmeRys-78B-Orpo-v0.1](https://huggingface.co/dfurman/CalmeRys-78B-Orpo-v0.1), itself the result of multiple fine-tunings and initially based on the foundation model [Qwen/Qwen2-72B-Instruct](https://huggingface.co/Qwen/Qwen2-72B-Instruct),
 using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
 
 My goal here is to verify whether the French DPO fine-tuning I developed for my Chocolatine model series can be applied with equal performance to model sizes > 70B params,
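
For reference, a minimal sketch of this kind of DPO pass with TRL's `DPOTrainer`. This is not the exact recipe behind this model: the column mapping (`question`/`chosen`/`rejected`, following the Orca DPO pairs layout) and all hyperparameters are assumptions, and a 78B model would additionally need multi-GPU sharding and/or PEFT.

```python
# Minimal DPO sketch with TRL, assuming the dataset follows the Orca DPO
# pairs layout ("question", "chosen", "rejected"). Hyperparameters are
# placeholders, not the recipe used for Chocolatine-78B-Instruct-DPO-v1.3.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "dfurman/CalmeRys-78B-Orpo-v0.1"
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base)

# Rename the prompt column to the "prompt"/"chosen"/"rejected" schema DPOTrainer expects.
dataset = load_dataset("jpacifico/french-orca-dpo-pairs-revised", split="train")
dataset = dataset.rename_column("question", "prompt")

args = DPOConfig(output_dir="chocolatine-78b-dpo-v1.3", beta=0.1, per_device_train_batch_size=1)
trainer = DPOTrainer(model=model, args=args, train_dataset=dataset, processing_class=tokenizer)
trainer.train()
```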