loubnabnl HF staff committed on
Commit
68d9f23
·
verified ·
1 Parent(s): 09ec5a3

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -15,9 +15,6 @@ We also included [ultrachat](https://huggingface.co/datasets/stingning/ultrachat
 
 We trained for 6 epochs, resulting in a model trained on 180B tokens with a sequence length of 2k, a global batch size of 1.3M tokens and a learning rate of 3e-4 with a cosine schedule for 14k steps.
 We used the tokenizer from [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1/).
-The training loss:
-
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/61c141342aac764ce1654e43/rJobY7F6tqTAvIox1ZGKR.png)
 
 # How to use
 
@@ -82,3 +79,6 @@ This is a small 1.8B model trained on synthetic data, so it might hallucinate, g
 - **GPUs:** 160 H100
 - **Training time:** 15 hours
 
+The training loss:
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/61c141342aac764ce1654e43/rJobY7F6tqTAvIox1ZGKR.png)
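As a side note, the hyperparameters in the edited paragraph can be sanity-checked with a short sketch. The batch size (1.3M tokens), sequence length (2k), and total token count (180B) are taken from the README text above; the derived figures (sequences per step, steps to consume the budget) are computed here and are not stated in the source:

```python
# Sanity-check sketch for the training configuration described in the README diff.
# Input figures come from the README; derived figures are computed below.

GLOBAL_BATCH_TOKENS = 1.3e6  # global batch size, in tokens per optimizer step
SEQ_LEN = 2048               # sequence length ("2k")
TOTAL_TOKENS = 180e9         # total tokens seen across the 6 epochs

# How many full-length sequences fit in one global batch
sequences_per_step = GLOBAL_BATCH_TOKENS / SEQ_LEN

# How many optimizer steps it takes to consume the full token budget
steps_for_budget = TOTAL_TOKENS / GLOBAL_BATCH_TOKENS

print(f"sequences per step: {sequences_per_step:.0f}")
print(f"steps for 180B tokens: {steps_for_budget:,.0f}")
```

Note that the 14k steps mentioned in the text refers to the length of the cosine learning-rate schedule, which is a separate scheduler setting from the step count derived above.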