Commit c4899d6 by mfajcik (1 parent: 2f52d85)

Update README.md

Files changed (1): README.md +5 -0
@@ -49,6 +49,11 @@ Figure 3: Test loss closeup, testing performed on split of internal-corpus #1. S
 
 ## Training Method
 ### Vocabulary Swap
+To transfer knowledge from an English model to Czech, we developed a simple method that (i) aligns several tokens between the two vocabularies and (ii) copies their embeddings from the original language to the new language.
+<img src="figures/tllama_test.png" width="900"/>
+
+Figure 4: Ablation: test perplexity over the course of training for the vocabulary swap method on TinyLLAMA. Our method (green curve) vs. TinyLLAMA trained from scratch (blue curve).
+
 The vocabulary swap was done the same way as for our [Czech-GPT-2](https://huggingface.co/BUT-FIT/Czech-GPT-2-XL-133k) model (see its model card for a comprehensive description).
 We managed to align 4,177 English tokens with corresponding Czech tokens.
 
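Steps (i) and (ii) of the vocabulary swap can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `swap_vocabulary`, the exact-string-match alignment rule, the random initialization of unaligned tokens, and the toy vocabularies are all assumptions made for illustration. The actual alignment procedure is described in the linked Czech-GPT-2 model card.

```python
import random


def swap_vocabulary(src_vocab, src_emb, tgt_vocab, seed=0):
    """Build a target embedding table, reusing source rows for aligned tokens.

    src_vocab / tgt_vocab: dict mapping token string -> row index (0..n-1).
    src_emb: list of embedding rows (lists of floats), indexed by source id.
    Alignment rule (an assumption of this sketch): exact string match.
    """
    rng = random.Random(seed)
    dim = len(src_emb[0])
    tgt_emb = [None] * len(tgt_vocab)
    aligned = []
    for token, tgt_id in tgt_vocab.items():
        src_id = src_vocab.get(token)
        if src_id is not None:
            # (i) token is aligned, (ii) copy its source embedding row.
            tgt_emb[tgt_id] = list(src_emb[src_id])
            aligned.append(token)
        else:
            # Assumed fallback: small random init for unaligned tokens.
            tgt_emb[tgt_id] = [rng.gauss(0.0, 0.02) for _ in range(dim)]
    return tgt_emb, aligned


# Toy vocabularies (hypothetical, not taken from either tokenizer).
english_vocab = {"the": 0, "ing": 1, "2024": 2, ".": 3}
english_emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
czech_vocab = {"pro": 0, "2024": 1, ".": 2, "ní": 3}

czech_emb, aligned_tokens = swap_vocabulary(english_vocab, english_emb, czech_vocab)
print(sorted(aligned_tokens))
```

In this toy example only the tokens shared by both vocabularies keep their English embeddings; every other Czech token starts from a fresh row and has to be learned during training.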