Update README.md
README.md
CHANGED
@@ -15,12 +15,13 @@ language:
 <em>[Paper][Code][🤗] (would be released soon)</em>
 </p>
 
-Infinity-Instruct-3M-0613-Mistral-7B is an opensource supervised instruction tuning model without reinforcement learning from human feedback (RLHF). This model is just finetuned on Infinity-Instruct-3M and Infinity-Instruct-0613
+Infinity-Instruct-3M-0613-Mistral-7B is an opensource supervised instruction tuning model without reinforcement learning from human feedback (RLHF). This model is just finetuned on [Infinity-Instruct-3M and Infinity-Instruct-0613](https://huggingface.co/datasets/BAAI/Infinity-Instruct) and it beats SOTA language models such as Mixtral 8x7B v0.1, Gemini Pro and GPT3.5 on AlpacaEval 2.0!
 
 ## **Training Details**
 <p align="center">
 <img src="fig/trainingflow.png">
 </p>
+
 Infinity-Instruct-3M-0613-Mistral-7B is tuned on Million-level instruction dataset [Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct). First, we apply the foundational dataset Infinity-Instruct-3M to improve the foundational ability (math & code) of Mistral-7B-v0.1, and get the foundational instruct model Infinity-Instruct-3M-Mistral-7B. Then we finetune the Infinity-Instruct-3M-Mistral-7B to get the stronger chat model Infinity-Instruct-3M-0613-Mistral-7B. Here is the training hyperparamers.
 
 ```bash