# summary
A multilingual tokenizer trained on multilingual data using the SentencePiece library with the BPE (byte-pair encoding) algorithm.

* vocab size: 100k
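
For readers unfamiliar with BPE, the following is a minimal pure-Python sketch of how merge rules are learned and applied. It is a toy illustration only, not SentencePiece's implementation and not the procedure used to train this tokenizer. The function names (`learn_bpe`, `merge_pair`, `encode`), the tiny corpus, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (hypothetical helper)."""
    # Start from single characters; identical words are counted once with a frequency.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties go to the lexicographically smallest pair
        # (an arbitrary choice made here so the result is deterministic).
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        # Apply the new merge to every word before counting again.
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[merge_pair(symbols, best)] += freq
        words = new_words
    return merges


def encode(word, merges):
    """Tokenize `word` by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy corpus, invented for illustration.
corpus = ["low", "low", "lower", "lowest", "newer", "newer"]
merges = learn_bpe(corpus, num_merges=3)
tokens = encode("lower", merges)
```

In this sketch the number of merges plays the role that the vocabulary size plays in a real tokenizer: each merge adds one new symbol, so a 100k vocabulary corresponds to a far longer merge list learned from a far larger corpus.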