sambanovasystems/SambaLingo-Turkish-Chat

zolicsaki committed on Feb 22, 2024

Commit 8a89b81 · verified · 1 Parent(s): cfe3a79

Update README.md

Files changed (1)

README.md +3 -0

@@ -47,6 +47,9 @@ model = AutoModelForCausalLM.from_pretrained("sambanovasystems/SambaLingo-Turkis
 ## Training Details
+## Tokenizer Details
+We extended the vocabulary of the base llama model from 32,000 tokens to 57,000 tokens by adding up to 25,000 non-overlapping tokens from the new language.
 ## Uses
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
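
The added "Tokenizer Details" line describes growing a 32,000-token vocabulary to 57,000 by appending up to 25,000 tokens that do not already exist in the base vocabulary. A minimal sketch of that arithmetic follows. The commit does not say how the new tokens were chosen or ordered, so the function name `extend_vocab`, the priority-ordered candidate list, and the placeholder token strings are all assumptions made for illustration.

```python
def extend_vocab(base_vocab, candidates, max_new):
    """Append up to max_new candidate tokens that are not already in base_vocab.

    Hypothetical helper: the commit does not describe the actual selection method.
    """
    extended = dict(base_vocab)  # token -> id; copy so the base mapping stays untouched
    next_id = max(extended.values(), default=-1) + 1
    added = 0
    for token in candidates:  # assumed to be ordered by priority
        if added >= max_new:
            break
        if token in extended:  # skip tokens that overlap with the base vocabulary
            continue
        extended[token] = next_id
        next_id += 1
        added += 1
    return extended


# Toy stand-ins: placeholder strings, not the real Llama or Turkish vocabularies.
base = {f"base_{i}": i for i in range(32_000)}
candidates = [f"base_{i}" for i in range(100)] + [f"new_{i}" for i in range(30_000)]

extended = extend_vocab(base, candidates, max_new=25_000)
print(len(extended))  # 57000
```

The 100 overlapping candidates are skipped, so exactly 25,000 new entries receive ids 32,000 through 56,999. With the Transformers library, a comparable extension is usually followed by resizing the model's embedding matrix (for example via `model.resize_token_embeddings(len(tokenizer))`); whether the authors did it this way is not stated in the commit.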