pdelobelle committed
Commit f486b26
Parent(s): 796a071
Update README.md

README.md CHANGED
@@ -19,7 +19,7 @@ license: apache-2.0
   <p><em>A small German LM</em></p>
 </div>
 
-BübleLM is a German language model based on Gemma-2B, adapted using [trans-tokenization](https://pieter.ai/trans-tokenization/) with a custom German SentencePiece tokenizer. The model demonstrates how language-specific tokenization can significantly improve performance while maintaining the base model's capabilities.
+BübleLM is a German language model based on Gemma-2-2B, adapted using [trans-tokenization](https://pieter.ai/trans-tokenization/) with a custom German SentencePiece tokenizer. The model demonstrates how language-specific tokenization can significantly improve performance while maintaining the base model's capabilities.
 
 ## Model Details
 
@@ -47,12 +47,12 @@ Data sampling weights:
 
 ## Performance
 
-Key improvements over the Gemma-2B baseline:
+Key improvements over the Gemma-2-2B baseline:
 - HellaSwag-DE: +71% (47.9% vs 28.0%)
 - ARC-DE: +41% (32.3% vs 22.9%)
 - Average zero-shot: +40% (35.8% vs 25.5%)
 
-→ BübleLM-2B consistently outperforms both the base Gemma-2B and other German models like LLaMmlein-1B across most tasks.
+→ BübleLM-2B consistently outperforms both the base Gemma-2-2B and other German models like LLaMmlein-1B across most tasks.
 
 <table class="model-comparison">
 <thead>