BengaliByteLevelBPETokenizerFast / tokenizer_config.json
Uploaded by faisaltareque in commit e3d89d4 ("Upload tokenizer")
{
  "clean_up_tokenization_spaces": true,
  "model_max_length": 1024,
  "special_tokens": [
    "<s>",
    "<pad>",
    "</s>",
    "<unk>",
    "<cls>",
    "<sep>",
    "<mask>"
  ],
  "tokenizer_class": "PreTrainedTokenizerFast"
}
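
The fields in this config can be checked with a few lines of Python. The following is a minimal sketch that uses only the standard library and embeds a copy of the config text so it runs offline. `CONFIG_TEXT` and `summarize_config` are hypothetical names introduced for illustration, not part of any library.

```python
import json

# Copy of the config shown above, embedded so the sketch needs no network access.
CONFIG_TEXT = """
{
  "clean_up_tokenization_spaces": true,
  "model_max_length": 1024,
  "special_tokens": ["<s>", "<pad>", "</s>", "<unk>", "<cls>", "<sep>", "<mask>"],
  "tokenizer_class": "PreTrainedTokenizerFast"
}
"""


def summarize_config(text: str) -> dict:
    """Parse the config JSON and report a few basic facts about it."""
    config = json.loads(text)
    special = config.get("special_tokens", [])
    return {
        "tokenizer_class": config.get("tokenizer_class"),
        "model_max_length": config.get("model_max_length"),
        "num_special_tokens": len(special),
        # True if any special token string is listed more than once.
        "has_duplicates": len(special) != len(set(special)),
    }


summary = summarize_config(CONFIG_TEXT)
print(summary)
```

To run the same check against a local copy of the file, read `tokenizer_config.json` from disk and pass its text to `summarize_config`.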