Rijgersberg committed · Commit 82a7864 · Parent: 76ba2c6
Update README.md

README.md CHANGED
@@ -15,29 +15,40 @@ language:
Before:

language:
- nl
pipeline_tag: conversational
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

#TODO model card

# GEITje-7B-chat-v2

This model is a fine-tuned version of [Rijgersberg/GEITje-7B](https://huggingface.co/Rijgersberg/GEITje-7B) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8011

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure
After:

language:
- nl
pipeline_tag: conversational
---

# GEITje-7B-chat-v2

# GEITje-7B

GEITje is a large open Dutch language model with 7 billion parameters, based on Mistral 7B.
It has been further trained on 10 billion tokens of Dutch text.
This has improved its Dutch language skills and increased its knowledge of Dutch topics.
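For illustration, here is a minimal sketch of generating Dutch text with this model using 🤗 Transformers; the prompt and sampling settings are assumptions, not recommendations from the model card:

```python
# Minimal sketch: Dutch text generation with GEITje-7B (assumed settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rijgersberg/GEITje-7B"  # the base model this chat variant builds on

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~14 GB of weights for 7B parameters in bf16
    device_map="auto",
)

prompt = "Amsterdam is de hoofdstad van"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```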

## Model description

### _Mistral_ – Base Model
GEITje is based on [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/).
It's a large open language model with 7 billion parameters,
trained by [Mistral AI](https://mistral.ai).
According to Mistral AI, the 7B model performs better than [Llama 2](https://ai.meta.com/llama/) 13B on all (English-language) benchmarks they tested it on.
Mistral 7B has been released under the Apache 2.0 open source license.

### _GEITje_ – Trained Further on Dutch Texts
GEITje was created by further training Mistral 7B on no less than 10 billion tokens of Dutch text from the [Dutch Gigacorpus](http://gigacorpus.nl) and the [MADLAD-400](https://huggingface.co/datasets/allenai/MADLAD-400) web crawling corpus.
It is a so-called _full-parameter finetune_:
performed on all parameters.
It is not a [PEFT](https://huggingface.co/blog/peft) or [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora) finetune.
Like Mistral, GEITje has a _context length_ of 8,192 tokens.
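To make that contrast concrete, here is a hypothetical sketch of the difference between a full-parameter finetune and a LoRA finetune; the `peft` configuration shown is illustrative only and is not how GEITje was trained:

```python
# Hypothetical sketch: a full-parameter finetune (GEITje's approach) updates
# every weight, while LoRA (shown only for contrast) freezes the base model
# and trains small adapter matrices instead.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Full-parameter finetune: all ~7B parameters receive gradient updates.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"full-parameter finetune: {trainable:,} trainable parameters")

# LoRA finetune, for contrast: base weights frozen, small adapters trained.
lora_model = get_peft_model(
    model,
    LoraConfig(task_type="CAUSAL_LM", r=16, target_modules=["q_proj", "v_proj"]),
)
lora_model.print_trainable_parameters()  # prints a small fraction of the total
```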

### _GEITje-chat_ – Finetuned for Dialogues
As a demonstration of GEITje's capabilities for chat applications, two initial chat variants of GEITje have also been finetuned: GEITje-chat and GEITje-chat-v2.
They can follow instructions, answer questions, and hold dialogues on a variety of topics.
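A minimal sketch of holding a dialogue with the chat variant, assuming the tokenizer ships a chat template; the question and generation settings here are made up:

```python
# Minimal sketch: chatting with GEITje-7B-chat-v2 via its chat template
# (assumed to be present in the tokenizer). Settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rijgersberg/GEITje-7B-chat-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Wat is de hoofdstad van Friesland?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```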

## More info
Read more about GEITje-chat in the [📄 README](https://github.com/Rijgersberg/GEITje/blob/main/README-en.md) on GitHub.

## Training procedure