pdelobelle committed 224f136 (parent: 2602fea)
Update README.md

README.md: CHANGED
```diff
@@ -23,6 +23,7 @@ Our tweety-7b-dutch model has an Apache 2.0 license, encouraging applications in
 - **Tokenizer:** Dutch, 50k tokens ([yhavinga/gpt-neo-1.3B-dutch](https://huggingface.co/yhavinga/gpt-neo-1.3B-dutch))
 - **Pre-training data:** Scraped Dutch ([yhavinga/mc4_nl_cleaned](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned))
 - **Context window**: 8192 tokens
+- **Training data**: 8.5B tokens
 - **Developed by:** KU Leuven and UGent
 - **Funded by:** KU Leuven BOF, VSC (Flemish Supercomputer Center), [Vlaams AI-onderzoeksprogramma](https://www.flandersairesearch.be/nl)
 - **Model type:** Foundation model
@@ -35,7 +36,9 @@ As a base model, tweety-7b-dutch is primed for direct applications across text g
 ## Technical Specifications
 
 ### Compute Infrastructure
-
+Training used Nvidia H100 and A100 GPUs. Inference is accessible on lower-end hardware: essentially any GPU capable of running Mistral models.
 
-
+### Model Weights
 
+- This model was trained in bfloat16.
+- [GGUF weights](https://huggingface.co/BramVanroy/tweety-7b-dutch-v24a-GGUF) are released by Bram Vanroy.
```
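The updated card documents bfloat16 training but includes no loading snippet. A minimal inference sketch with the standard transformers API follows; the repository id Tweeties/tweety-7b-dutch-v24a is an assumption inferred from the GGUF mirror's name, so check the model card for the exact id.

```python
# A minimal inference sketch, not an official snippet from the card.
# Assumptions: the base weights live at Tweeties/tweety-7b-dutch-v24a
# (inferred from the GGUF mirror's name), and transformers plus
# accelerate are installed with a GPU that fits a 7B model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tweeties/tweety-7b-dutch-v24a"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)

# The card states the model was trained in bfloat16, so load the weights
# in that precision; device_map="auto" places layers on available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Vlaanderen en Nederland delen"  # illustrative Dutch prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```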
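For the GGUF route mentioned in the new Model Weights bullet (CPU or low-VRAM inference), a sketch using llama-cpp-python is below; the quantization filename is a hypothetical placeholder, so list the files in BramVanroy/tweety-7b-dutch-v24a-GGUF to pick a real one.

```python
# Sketch for running the community GGUF conversion with llama-cpp-python.
# The filename below is hypothetical; check the repository's file list
# for the actual quantizations published by Bram Vanroy.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="BramVanroy/tweety-7b-dutch-v24a-GGUF",
    filename="tweety-7b-dutch-v24a-Q5_K_M.gguf",  # hypothetical filename
)

llm = Llama(model_path=gguf_path, n_ctx=8192)  # match the training context
out = llm("Vlaanderen en Nederland delen", max_tokens=50)
print(out["choices"][0]["text"])
```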