Rijgersberg committed · Commit 82a7864 · Parent: 76ba2c6
Update README.md

README.md CHANGED
@@ -15,29 +15,40 @@ language:
Before:

language:
- nl
pipeline_tag: conversational
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

#TODO model card

# GEITje-7B-chat-v2

This model is a fine-tuned version of [Rijgersberg/GEITje-7B](https://huggingface.co/Rijgersberg/GEITje-7B) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8011

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure
After:

language:
- nl
pipeline_tag: conversational
---

# GEITje-7B-chat-v2

# GEITje-7B

GEITje is a large open Dutch language model with 7 billion parameters, based on Mistral 7B.
It has been further trained on 10 billion tokens of Dutch text.
This has improved its Dutch language skills and increased its knowledge of Dutch topics.
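For illustration, here is a minimal sketch of generating Dutch text with this model using 🤗 Transformers; the prompt and sampling settings are assumptions, not recommendations from the model card:

```python
# Minimal sketch: Dutch text generation with GEITje-7B (assumed settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rijgersberg/GEITje-7B"  # the base model this chat variant builds on

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~14 GB of weights for 7B parameters in bf16
    device_map="auto",
)

prompt = "Amsterdam is de hoofdstad van"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```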

## Model description

### _Mistral_ – Base Model
GEITje is based on [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/).
It's a large open language model with 7 billion parameters,
trained by [Mistral AI](https://mistral.ai).
According to Mistral AI, the 7B model performs better than [Llama 2](https://ai.meta.com/llama/) 13B on all (English-language) benchmarks they tested it on.
Mistral 7B has been released under the Apache 2.0 open source license.

### _GEITje_ – Trained Further on Dutch Texts
GEITje was created by further training Mistral 7B on no less than 10 billion tokens of Dutch text from the [Dutch Gigacorpus](http://gigacorpus.nl) and the [MADLAD-400](https://huggingface.co/datasets/allenai/MADLAD-400) web crawling corpus.
It is a so-called _full-parameter finetune_:
performed on all parameters.
It is not a [PEFT](https://huggingface.co/blog/peft) or [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora) finetune.
Like Mistral, GEITje has a _context length_ of 8,192 tokens.
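To make that contrast concrete, here is a hypothetical sketch of the difference between a full-parameter finetune and a LoRA finetune; the `peft` configuration shown is illustrative only and is not how GEITje was trained:

```python
# Hypothetical sketch: a full-parameter finetune (GEITje's approach) updates
# every weight, while LoRA (shown only for contrast) freezes the base model
# and trains small adapter matrices instead.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Full-parameter finetune: all ~7B parameters receive gradient updates.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"full-parameter finetune: {trainable:,} trainable parameters")

# LoRA finetune, for contrast: base weights frozen, small adapters trained.
lora_model = get_peft_model(
    model,
    LoraConfig(task_type="CAUSAL_LM", r=16, target_modules=["q_proj", "v_proj"]),
)
lora_model.print_trainable_parameters()  # prints a small fraction of the total
```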

### _GEITje-chat_ – Finetuned for Dialogues
As a demonstration of GEITje's capabilities for chat applications, two initial chat variants of GEITje have also been finetuned: GEITje-chat and GEITje-chat-v2.
They can follow instructions, answer questions, and hold dialogues on a variety of topics.
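A minimal sketch of holding a dialogue with the chat variant, assuming the tokenizer ships a chat template; the question and generation settings here are made up:

```python
# Minimal sketch: chatting with GEITje-7B-chat-v2 via its chat template
# (assumed to be present in the tokenizer). Settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rijgersberg/GEITje-7B-chat-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Wat is de hoofdstad van Friesland?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```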

## More info
Read more about GEITje-chat in the [📄 README](https://github.com/Rijgersberg/GEITje/blob/main/README-en.md) on GitHub.

## Training procedure