Rijgersberg committed on
Commit 82a7864 • 1 Parent(s): 76ba2c6

Update README.md

Files changed (1): README.md +24 -13
README.md CHANGED
@@ -15,29 +15,40 @@ language:
 - nl
 pipeline_tag: conversational
 ---

+# GEITje-7B-chat-v2

-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-#TODO model card
+# GEITje-7B

-# GEITje-7B-chat-v2
+GEITje is a large open Dutch language model with 7 billion parameters, based on Mistral 7B.
+It has been further trained on 10 billion tokens of Dutch text.
+This has improved its Dutch language skills and increased its knowledge of Dutch topics.

-This model is a fine-tuned version of [Rijgersberg/GEITje-7B](https://huggingface.co/Rijgersberg/GEITje-7B) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.8011

 ## Model description

-More information needed
+### _Mistral_ – Base Model
+GEITje is based on [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/).
+It's a large open language model with 7 billion parameters,
+trained by [Mistral AI](https://mistral.ai).
+According to Mistral AI, the 7B model performs better than [Llama 2](https://ai.meta.com/llama/) 13B on all (English-language) benchmarks they tested it on.
+Mistral 7B has been released under the Apache 2.0 open source license.
+
+
+### _GEITje_ – Trained Further on Dutch Texts
+GEITje was created by further training Mistral 7B on no less than 10 billion tokens of Dutch text from the [Dutch Gigacorpus](http://gigacorpus.nl) and the [MADLAD-400](https://huggingface.co/datasets/allenai/MADLAD-400) web crawling corpus.
+It is a so-called _full-parameter finetune_:
+performed on all parameters.
+It is not a [PEFT](https://huggingface.co/blog/peft) or [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora) finetune.
+Like Mistral, GEITje has a _context length_ of 8,192 tokens.

-## Intended uses & limitations
+### _GEITje-chat_ – Finetuned for Dialogues
+As a demonstration of GEITje's capabilities for chat applications, two initial chat variants of GEITje have also been finetuned: GEITje-chat and GEITje-chat-v2.
+They can follow instructions, answer questions, and hold dialogues on a variety of topics.

-More information needed

-## Training and evaluation data
+## More info
+Read more about GEITje-chat in the [📄 README](https://github.com/Rijgersberg/GEITje/blob/main/README-en.md) on GitHub.

-More information needed

 ## Training procedure
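The new _GEITje_ section above distinguishes a full-parameter finetune from a PEFT/LoRA finetune. A minimal sketch of that distinction, not part of this commit: it loads the real `mistralai/Mistral-7B-v0.1` base checkpoint, but the LoRA settings (`r`, `target_modules`) are arbitrary illustrative values, not GEITje's training configuration.

```python
# Contrast a full-parameter finetune (GEITje's approach: all weights trainable)
# with a LoRA finetune (base weights frozen, only small adapters trained).
# Illustrative only; the LoRA hyperparameters are arbitrary examples.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Full-parameter finetune: every parameter keeps requires_grad=True.
full_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"full finetune trains all {full_trainable:,} parameters")

# LoRA: wrap the model so only low-rank adapter matrices are trainable.
lora_model = get_peft_model(
    model, LoraConfig(r=16, target_modules=["q_proj", "v_proj"]))
lora_model.print_trainable_parameters()  # a small fraction of the ~7B total
```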
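The _GEITje-chat_ section describes instruction following and dialogue. A minimal usage sketch, also not part of this commit, assuming the `Rijgersberg/GEITje-7B-chat-v2` tokenizer ships a chat template; the Dutch prompt and sampling settings are illustrative:

```python
# Chat with GEITje-7B-chat-v2 through the transformers text-generation pipeline.
from transformers import pipeline

generator = pipeline("text-generation",
                     model="Rijgersberg/GEITje-7B-chat-v2",
                     device_map="auto")

messages = [{"role": "user", "content": "Wat is de hoofdstad van Friesland?"}]
# Render the conversation with the tokenizer's chat template (assumed present).
prompt = generator.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True)

reply = generator(prompt, max_new_tokens=128, do_sample=True,
                  return_full_text=False)[0]["generated_text"]
print(reply)
```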