Update README.md
README.md
@@ -17,7 +17,7 @@ base_model: meta-llama/Llama-2-13b-hf
 # Model Card for Tulu V2 DPO 13B

 Tulu is a series of language models that are trained to act as helpful assistants.
-Tulu V2 DPO
+Tulu V2 DPO 13B is a fine-tuned version of Llama 2 that was trained on a mix of publicly available, synthetic and human datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
 This model is a strong alternative to Llama 2 13b Chat.


@@ -26,7 +26,7 @@ This model is a strong alternative to Llama 2 13b Chat.
 - **Model type:** The flagship model of a suite of instruction and RLHF tuned chat models on a mix of publicly available, synthetic and human-created datasets.
 - **Language(s) (NLP):** Primarily English
 - **License:** [AI2 ImpACT](https://allenai.org/impact-license) Low-risk license.
-- **Finetuned from model:** [meta-llama/Llama-2-
+- **Finetuned from model:** [meta-llama/Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf)

 ### Model Sources