metinovadilet committed
Commit d8b92a1 · verified · 1 parent: 574c927

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
  - kyrgyz
  - tokenizer
  ---
- A tokenizer tailored for the Kyrgyz language, utilizing SentencePiece with Byte Pair Encoding (BPE) to offer efficient and precise tokenization. It features a 50,000-subword vocabulary, ensuring optimal performance for various Kyrgyz NLP tasks. This tokenizer was developed in collaboration with UlutSoft LLC to reflect authentic Kyrgyz language usage.
+ A tokenizer tailored for the Kyrgyz language, utilizing SentencePiece with Byte Pair Encoding (BPE) to offer efficient and precise tokenization. It features a 100,000-subword vocabulary, ensuring optimal performance for various Kyrgyz NLP tasks. This tokenizer was developed in collaboration with UlutSoft LLC to reflect authentic Kyrgyz language usage.
  Features:

  Language: Kyrgyz