Update README.md
README.md (CHANGED)
@@ -128,7 +128,7 @@ The results show that EuroLLM-1.7B is substantially better than Gemma-2B in Mach
 |Gemma-7B-EuroBlocks| 80.88|80.45|82.60|80.43|81.91|80.14|80.32|82.17|84.08|81.86|72.71|85.55|79.65|

 ### General Benchmarks
-We also compare EuroLLM-1.7B with [TinyLlama-
+We also compare EuroLLM-1.7B with [TinyLlama-v1.1](https://huggingface.co/TinyLlama/TinyLlama_v1.1) and [Gemma-2B](https://huggingface.co/google/gemma-2b) on two general benchmarks: Arc Challenge and Hellaswag.
 For the non-English languages we use the [Okapi](https://aclanthology.org/2023.emnlp-demo.28.pdf) datasets.
 Results show that EuroLLM-1.7B is superior to TinyLlama-1.1-3T and similar to Gemma-2B on Hellaswag but worse on Arc Challenge. This may be due to the smaller number of parameters of EuroLLM-1.7B (1.133B non-embedding parameters against 1.981B).

@@ -141,6 +141,6 @@ Results show that EuroLLM-1.7B is superior to TinyLlama-1.1-3T and similar to Ge
 #### Hellaswag
 | Model | Average | Average (excl. English) | English | German | Spanish | French | Italian | Portuguese | Russian | Dutch | Arabic | Swedish | Hindi | Hungarian | Romanian | Ukrainian | Danish | Catalan |
 |--------------------|---------|---------|---------|--------|---------|--------|---------|------------|---------|--------|--------|---------|--------|-----------|----------|-----------|--------|---------|
-| 0.4744 | 0.4654 | 0.6084 | 0.4772 | 0.5310 | 0.5260 | 0.5067 | 0.5206 | 0.4674 | 0.4893 | 0.4075 | 0.4813 | 0.3605 | 0.4067 | 0.4598 | 0.4368 | 0.4700 | 0.4405 |
-| 0.3674 | 0.3503 | 0.6248 | 0.3650 | 0.4137 | 0.4010 | 0.3780 | 0.3892 | 0.3494 | 0.3588 | 0.2880 | 0.3561 | 0.2841 | 0.3073 | 0.3267 | 0.3349 | 0.3408 | 0.3613 |
-| 0.4666 | 0.4499 | 0.7165 | 0.4756 | 0.5414 | 0.5180 | 0.4841 | 0.5081 | 0.4664 | 0.4655 | 0.3868 | 0.4383 | 0.3413 | 0.3710 | 0.4316 | 0.4291 | 0.4471 | 0.4448 |
+| EuroLLM-1.7B-Instruct | 0.4744 | 0.4654 | 0.6084 | 0.4772 | 0.5310 | 0.5260 | 0.5067 | 0.5206 | 0.4674 | 0.4893 | 0.4075 | 0.4813 | 0.3605 | 0.4067 | 0.4598 | 0.4368 | 0.4700 | 0.4405 |
+| TinyLlama-v1.1 | 0.3674 | 0.3503 | 0.6248 | 0.3650 | 0.4137 | 0.4010 | 0.3780 | 0.3892 | 0.3494 | 0.3588 | 0.2880 | 0.3561 | 0.2841 | 0.3073 | 0.3267 | 0.3349 | 0.3408 | 0.3613 |
+| Gemma-2B | 0.4666 | 0.4499 | 0.7165 | 0.4756 | 0.5414 | 0.5180 | 0.4841 | 0.5081 | 0.4664 | 0.4655 | 0.3868 | 0.4383 | 0.3413 | 0.3710 | 0.4316 | 0.4291 | 0.4471 | 0.4448 |
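
The non-embedding parameter counts quoted above (1.133B vs. 1.981B) can be reproduced along these lines. A minimal sketch, assuming the public `utter-project/EuroLLM-1.7B` and `google/gemma-2b` Hugging Face checkpoints, and subtracting the LM head only when it is untied from the input embeddings:

```python
# Minimal sketch (not from the model card): count non-embedding parameters.
# Assumes the public Hugging Face checkpoints named below; gated repos such as
# Gemma require accepting the license and logging in with huggingface-cli first.
from transformers import AutoModelForCausalLM

def non_embedding_params(model_id: str) -> int:
    model = AutoModelForCausalLM.from_pretrained(model_id)
    total = sum(p.numel() for p in model.parameters())
    embed = model.get_input_embeddings().weight
    removed = embed.numel()
    head = model.get_output_embeddings()
    # Subtract the LM head only when it is a separate matrix (untied weights);
    # for tied models the head shares storage with the input embeddings.
    if head is not None and head.weight is not embed:
        removed += head.weight.numel()
    return total - removed

for model_id in ("utter-project/EuroLLM-1.7B", "google/gemma-2b"):
    print(f"{model_id}: {non_embedding_params(model_id) / 1e9:.3f}B non-embedding")
```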
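
The two average columns in the Hellaswag table can be recomputed from the per-language scores: the first is the mean over all 16 listed languages, the second the mean excluding English. A quick check against the TinyLlama-v1.1 row:

```python
# Sanity check of the two average columns in the Hellaswag table,
# using the per-language scores from the TinyLlama-v1.1 row above.
scores = [0.6248, 0.3650, 0.4137, 0.4010, 0.3780, 0.3892, 0.3494, 0.3588,
          0.2880, 0.3561, 0.2841, 0.3073, 0.3267, 0.3349, 0.3408, 0.3613]

avg_all = sum(scores) / len(scores)            # mean over all 16 languages
avg_no_en = sum(scores[1:]) / len(scores[1:])  # mean excluding English

print(round(avg_all, 4))    # 0.3674 -> matches the "Average" column
print(round(avg_no_en, 4))  # 0.3503 -> matches "Average (excl. English)"
```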