pdelobelle committed on
Commit
f486b26
1 Parent(s): 796a071

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED

```diff
@@ -19,7 +19,7 @@ license: apache-2.0
 <p><em>A small German LM</em></p>
 </div>
 
-BübleLM is a German language model based on Gemma-2B, adapted using [trans-tokenization](https://pieter.ai/trans-tokenization/) with a custom German SentencePiece tokenizer. The model demonstrates how language-specific tokenization can significantly improve performance while maintaining the base model's capabilities.
+BübleLM is a German language model based on Gemma-2-2B, adapted using [trans-tokenization](https://pieter.ai/trans-tokenization/) with a custom German SentencePiece tokenizer. The model demonstrates how language-specific tokenization can significantly improve performance while maintaining the base model's capabilities.
 
 ## Model Details
 
@@ -47,12 +47,12 @@ Data sampling weights:
 
 ## Performance
 
-Key improvements over Gemma-2B baseline:
+Key improvements over Gemma-2-2B baseline:
 - HellaSwag-DE: +71% (47.9% vs 28.0%)
 - ARC-DE: +41% (32.3% vs 22.9%)
 - Average zero-shot: +40% (35.8% vs 25.5%)
 
-→ BübleLM-2B onsistently outperforms both the base Gemma-2B and other German models like LLaMmlein-1B across most tasks.
+→ BübleLM-2B onsistently outperforms both the base Gemma-2-2B and other German models like LLaMmlein-1B across most tasks.
 
 <table class="model-comparison">
 <thead>
```
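
The relative gains quoted in the Performance section can be sanity-checked with a few lines of Python. The scores are copied verbatim from the README; the `relative_gain` helper is my own naming, not part of the model's tooling:

```python
# Scores are (BübleLM-2B, Gemma-2-2B baseline) accuracy percentages,
# taken verbatim from the README's "Performance" section.
scores = {
    "HellaSwag-DE": (47.9, 28.0),
    "ARC-DE": (32.3, 22.9),
    "Average zero-shot": (35.8, 25.5),
}

def relative_gain(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, in percent."""
    return (new / base - 1.0) * 100.0

for task, (buble, gemma) in scores.items():
    print(f"{task}: +{relative_gain(buble, gemma):.0f}%")
# → HellaSwag-DE: +71%
# → ARC-DE: +41%
# → Average zero-shot: +40%
```

This confirms the +71% / +41% / +40% figures are relative (not absolute) improvements over the baseline accuracies.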