Update README.md

Commit bc6c48f, committed by natolambert
1 parent: 9b9d8d5

README.md CHANGED
@@ -20,7 +20,9 @@ base_model: meta-llama/Llama-2-70b-hf
 
 # Model Card for Tulu V2 DPO 70B
 
-
+Tulu is a series of language models that are trained to act as helpful assistants.
+Tulu V2 DPO 70B, and is a fine-tuned version of Llama 2 that was trained on on a mix of publicly available, synthetic and human datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
+This model is a strong alternative to Llama 2 70b Chat.
 
 
 ## Model description
@@ -28,7 +30,7 @@ Zephyr is a series of language models that are trained to act as helpful assista
 - **Model type:** The flagship model of a suite of instruction and RLHF tuned chat models on a mix of publicly available, synthetic and human-created datasets.
 - **Language(s) (NLP):** Primarily English
 - **License:** MIT
-- **Finetuned from model:** [meta-llama/Llama-2-70b-hf](https://huggingface.co/
+- **Finetuned from model:** [meta-llama/Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf)
 
 ### Model Sources
 
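
The text added in this commit says the model was trained with Direct Preference Optimization (DPO). As a rough illustration of what that objective computes, here is a minimal sketch of the per-example DPO loss as described in the linked paper. It is not taken from the Tulu training code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example, and the inputs are assumed to be summed log-probabilities of each response under the policy and under a frozen reference model.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value for illustration, not from this commit
) -> float:
    """Per-example DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)).

    Each margin is the log-probability ratio between the policy and the
    reference model for one response.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) equals softplus(-z)
    return softplus(-logits)


if __name__ == "__main__":
    # Policy prefers the chosen response more than the reference does,
    # so the loss falls below log(2), the value at zero margin.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

When the policy and reference assign identical log-probabilities, both margins are zero and the loss is log(2). It decreases as the policy shifts probability toward the chosen response relative to the reference.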