Update README.md

**Occiglot-7B-EU5** is a generative language model with 7B parameters supporting the top-5 EU languages (English, Spanish, French, German, and Italian), trained by the [German Research Center for Artificial Intelligence (DFKI)](https://www.dfki.de/en/web).
It is based on [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) and trained on 293B tokens of additional multilingual and code data with a block size of 8,192 tokens per sample.

Note that the model is a general-purpose base model and was not instruction-fine-tuned nor optimized for chat or other applications. We make an instruction-tuned variant available as [occiglot-7b-eu5-instruct](https://huggingface.co/occiglot/occiglot-7b-eu5-instruct).

This is the first release of an ongoing open research project for multilingual language models.
If you want to train a model for your own language or are working on evaluations, please contact us or join our [Discord server](https://discord.gg/wUpvYs4XvM). **We are open to collaborations!**
- **Compute resources:** [HessianAI's 42](https://hessian.ai/)
- **Contributors:** Manuel Brack, Patrick Schramowski, Pedro Ortiz, Malte Ostendorff, Fabio Barth, Georg Rehm, Kristian Kersting
- **Research labs:** [SAINT](https://www.dfki.de/en/web/research/research-departments/foundations-of-systems-ai) and [SLT](https://www.dfki.de/en/web/research/research-departments/speech-and-language-technology)
- **Contact:** [Discord](https://discord.gg/wUpvYs4XvM) or [hello@occiglot.org](mailto:hello@occiglot.org)
### How to use
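Since this is a base model without chat tuning, it is used through plain text completion. Below is a minimal usage sketch with the Hugging Face `transformers` library; the repository ID `occiglot/occiglot-7b-eu5` and the sampling settings are assumptions, so check the model card header for the exact values.

```python
# Minimal usage sketch, not an official snippet: the repository ID
# "occiglot/occiglot-7b-eu5" and the sampling settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "occiglot/occiglot-7b-eu5"  # assumed repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 7B weights at ~14 GB
    device_map="auto",           # place the layers on the available GPU(s)
)

# Base model: plain-text completion, no chat template.
prompt = "Les langues officielles de l'Union européenne sont"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For chat-style applications, the instruction-tuned variant linked above is the better starting point.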
Preliminary evaluation results can be found below.
Please note that the non-English results are based on partially machine-translated datasets and English prompts ([Belebele](https://huggingface.co/datasets/facebook/belebele) and the [Okapi framework](https://github.com/nlp-uoregon/Okapi)) and should therefore be interpreted with caution; for example, they may be biased towards English model performance.
Currently, we are working on more suitable benchmarks for Spanish, French, German, and Italian.
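For reference, the referenced datasets can be inspected directly. A sketch for Belebele follows; the config name (`deu_Latn`) and the field names are assumptions based on the dataset's published FLORES-style schema and should be verified against the current dataset card.

```python
# Sketch for peeking at one of the referenced benchmarks; the config name
# "deu_Latn" and the field names are assumptions based on the public schema.
from datasets import load_dataset

belebele_de = load_dataset("facebook/belebele", "deu_Latn", split="test")
example = belebele_de[0]
print(example["flores_passage"][:200])  # passage the question refers to
print(example["question"])              # comprehension question
```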
<details>
<summary>Evaluation results</summary>

### All languages
| **model_name** | **arc_challenge** | **hellaswag** | **belebele** | **mmlu** | **avg** |
|---|---|---|---|---|---|
| leo-mistral-hessianai-7b | 0.4328 | 0.5580 | 0.5967 | 0.4311 | 0.5047 |
| Occiglot-7B-EU5 | 0.5013 | 0.7008 | 0.6522 | 0.4949 | 0.5873 |

</details>
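As a quick sanity check, the **avg** column is consistent with the unweighted mean of the four per-task scores; for the Occiglot-7B-EU5 row:

```python
# The avg column matches the unweighted mean of the four per-task scores.
scores = {"arc_challenge": 0.5013, "hellaswag": 0.7008, "belebele": 0.6522, "mmlu": 0.4949}
print(round(sum(scores.values()) / len(scores), 4))  # 0.5873, as reported
```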
## Acknowledgements
## See also
- https://huggingface.co/NikolayKozloff/occiglot-7b-eu5-GGUF
- https://huggingface.co/collections/occiglot/occiglot-eu5-7b-v01-65dbed502a6348b052695e01