Update README.md
README.md CHANGED
@@ -8,6 +8,8 @@ datasets:
 Dpo trained from cognitivecomputations/dolphin-2.6-mistral-7b, used Intel/orca_dpo_pairs for the dataset.
 Trained for 1200 steps. Trained with 1024 context window.
 
+Training code: https://github.com/hengjiUSTC/learn-llm/blob/main/dpo_demo.ipynb
+
 # Model Details
 * **Trained by**: trained by HenryJJ.
 * **Model type:** **dolphin-2.6-mistral-7b-dpo-orca** is an auto-regressive language model based on the Llama 2 transformer architecture.