allenai/tulu-2-dpo-13b

hamishivi committed on Nov 18, 2023

Commit c8d7325 • 1 Parent(s): 99c2700

Update README.md

Files changed (1)

README.md +2 -2

@@ -17,7 +17,7 @@ base_model: meta-llama/Llama-2-13b-hf
 # Model Card for Tulu V2 DPO 13B
 Tulu is a series of language models that are trained to act as helpful assistants.
-Tulu V2 DPO 70B, and is a fine-tuned version of Llama 2 that was trained on on a mix of publicly available, synthetic and human datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
+Tulu V2 DPO 13B is a fine-tuned version of Llama 2 that was trained on on a mix of publicly available, synthetic and human datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
 This model is a strong alternative to Llama 2 13b Chat.
@@ -26,7 +26,7 @@ This model is a strong alternative to Llama 2 13b Chat.
 - **Model type:** The flagship model of a suite of instruction and RLHF tuned chat models on a mix of publicly available, synthetic and human-created datasets.
 - **Language(s) (NLP):** Primarily English
 - **License:** [AI2 ImpACT](https://allenai.org/impact-license) Low-risk license.
-- **Finetuned from model:** [meta-llama/Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf)
+- **Finetuned from model:** [meta-llama/Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf)
 ### Model Sources