---
base_model: KennethTM/gpt2-small-danish
datasets:
- oscar
inference: false
language:
- da
model_creator: KennethTM
model_name: gpt2-small-danish
pipeline_tag: text-generation
quantized_by: afrideva
tags:
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
widget:
- text: Der var engang
---

# KennethTM/gpt2-small-danish-GGUF

Quantized GGUF model files for [gpt2-small-danish](https://huggingface.co/KennethTM/gpt2-small-danish) from [KennethTM](https://huggingface.co/KennethTM).

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [gpt2-small-danish.fp16.gguf](https://huggingface.co/afrideva/gpt2-small-danish-GGUF/resolve/main/gpt2-small-danish.fp16.gguf) | fp16 | 328.21 MB |
| [gpt2-small-danish.q2_k.gguf](https://huggingface.co/afrideva/gpt2-small-danish-GGUF/resolve/main/gpt2-small-danish.q2_k.gguf) | q2_k | 81.30 MB |
| [gpt2-small-danish.q3_k_m.gguf](https://huggingface.co/afrideva/gpt2-small-danish-GGUF/resolve/main/gpt2-small-danish.q3_k_m.gguf) | q3_k_m | 95.56 MB |
| [gpt2-small-danish.q4_k_m.gguf](https://huggingface.co/afrideva/gpt2-small-danish-GGUF/resolve/main/gpt2-small-danish.q4_k_m.gguf) | q4_k_m | 110.27 MB |
| [gpt2-small-danish.q5_k_m.gguf](https://huggingface.co/afrideva/gpt2-small-danish-GGUF/resolve/main/gpt2-small-danish.q5_k_m.gguf) | q5_k_m | 124.20 MB |
| [gpt2-small-danish.q6_k.gguf](https://huggingface.co/afrideva/gpt2-small-danish-GGUF/resolve/main/gpt2-small-danish.q6_k.gguf) | q6_k | 136.02 MB |
| [gpt2-small-danish.q8_0.gguf](https://huggingface.co/afrideva/gpt2-small-danish-GGUF/resolve/main/gpt2-small-danish.q8_0.gguf) | q8_0 | 175.47 MB |
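
GGUF files are typically run with llama.cpp-compatible tooling. As a minimal sketch of downloading and running one of the files above, assuming the `huggingface_hub` and `llama-cpp-python` packages (neither is prescribed by the original card):

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one of the quantized files from the table above (q4_k_m as an example).
model_path = hf_hub_download(
    repo_id="afrideva/gpt2-small-danish-GGUF",
    filename="gpt2-small-danish.q4_k_m.gguf",
)

# Load the GGUF file and generate a short Danish continuation.
llm = Llama(model_path=model_path)
output = llm("Der var engang", max_tokens=50)
print(output["choices"][0]["text"])
```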

## Original Model Card:
# What is this?

A GPT-2 model (small version, 124M parameters) for Danish text generation. The model was not pre-trained from scratch but adapted from the English version.

# How to use

Test the model using the pipeline from the [🤗 Transformers](https://github.com/huggingface/transformers) library:

```python
from transformers import pipeline

# The pipeline downloads the model on first use and handles tokenization.
generator = pipeline("text-generation", model="KennethTM/gpt2-small-danish")
text = generator("Manden arbejdede som")

print(text[0]["generated_text"])
```

Or load it using the Auto* classes:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("KennethTM/gpt2-small-danish")
model = AutoModelForCausalLM.from_pretrained("KennethTM/gpt2-small-danish")
```
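
With the model and tokenizer loaded this way, text can be generated through the standard `generate` API. A minimal sketch (the prompt and sampling settings below are illustrative, not from the original card):

```python
# Encode a Danish prompt and sample a continuation.
inputs = tokenizer("Der var engang", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```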

# Model training

The model is trained on the Danish part of the [oscar dataset](https://huggingface.co/datasets/oscar) ('unshuffled_deduplicated_da') with a context length of 1024 tokens.

The model weights are initialized from the English [GPT-2 small model](https://huggingface.co/gpt2), with new word token embeddings created for Danish using [WECHSEL](https://github.com/CPJKU/wechsel).

Initially, only the word token embeddings are trained on 50,000 samples. Finally, the whole model is trained on 1,000,000 samples.
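
The card does not include training code; a hypothetical sketch of that staged schedule (freeze everything, train only the token embeddings, then unfreeze for full training), with the WECHSEL embedding initialization omitted:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # English weights as the starting point

# Stage 1: train only the (newly created) word token embeddings.
for param in model.parameters():
    param.requires_grad = False
for param in model.get_input_embeddings().parameters():
    param.requires_grad = True
# ... embedding-only training loop over the first 50,000 samples ...

# Stage 2: unfreeze everything and train the whole model.
for param in model.parameters():
    param.requires_grad = True
# ... full training loop over the 1,000,000 samples ...
```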

For reference, the model achieves a perplexity of 33.5 on 5,000 random validation samples.
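
Perplexity is the exponential of the mean cross-entropy loss. As a minimal sketch of how such a number can be computed for a single text (reusing `model` and `tokenizer` from above; this is not the card's actual evaluation code):

```python
import torch

# Cross-entropy loss over one sample's tokens; perplexity = exp(loss).
inputs = tokenizer("En kort dansk tekst til evaluering.", return_tensors="pt")
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print(torch.exp(loss).item())
```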

Model training was carried out on an 8 GB GPU.

# Notes

This is a pre-trained model; for optimal performance, it should be fine-tuned for new tasks.