HenryJJ/dolphin-2.6-mistral-7b-dpo-orca-v1


HenryJJ committed on Jan 14

Commit 3b71102 • 1 Parent(s): 5a78925

Update README.md

Files changed (1)

README.md +2 -0


@@ -8,6 +8,8 @@ datasets:

 DPO-trained from cognitivecomputations/dolphin-2.6-mistral-7b, using Intel/orca_dpo_pairs as the dataset.
 Trained for 1200 steps with a 1024-token context window.
+Training code: https://github.com/hengjiUSTC/learn-llm/blob/main/dpo_demo.ipynb
 # Model Details
 * **Trained by**: HenryJJ.
 * **Model type:** **dolphin-2.6-mistral-7b-dpo-orca** is an auto-regressive language model based on the Mistral 7B transformer architecture.
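
For readers unfamiliar with DPO, here is a minimal sketch of the standard per-pair DPO loss that this kind of training optimizes. It is not taken from the linked notebook, which remains the authoritative training code. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_logratio - rejected_logratio))).

    beta=0.1 is an illustrative default, not a value confirmed by the README.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model, in log space.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch on the sign
    # so that exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # The policy matches the reference, so the margin is zero and the loss is log(2).
    print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
    # The policy prefers the chosen response more than the reference does: lower loss.
    print(dpo_loss(-9.0, -13.0, -10.0, -12.0))
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model. Datasets such as Intel/orca_dpo_pairs supply the chosen/rejected pairs that this objective consumes.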