hezarai/t5-base-fa

arxyzan committed on Aug 2, 2023

Commit 2b2d98f (1 parent: d957b41)

Hezar: Upload model and config

Files changed (1)

preprocessor/tokenizer_config.yaml ADDED

+name: sentencepiece_unigram_tokenizer
+config_type: preprocessor
+pretrained_path: t5-base-fa
+max_length: 512
+truncation_strategy: longest_first
+truncation_direction: right
+stride: 0
+padding_strategy: longest
+padding_direction: right
+pad_to_multiple_of: 0
+pad_token_id: 0
+pad_token: <pad>
+pad_token_type_id: 0
+unk_token: <unk>
+special_tokens:
+- <s>
+- <pad>
+- </s>
+- <unk>
+- <mask>
+- <|endoftext|>
+- <|startoftext|>
+- <nl>
+- <hs>
+- <sep>
+- <cls>
+continuing_subword_prefix: ''
+replacement: _
+add_prefix_space: true
+end_of_word_suffix: ''
+fuse_unk: false
+vocab_size: 32103
+min_frequency: 2
+limit_alphabet: 1000
+initial_alphabet: []
+show_progress: true
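
For readers unfamiliar with the truncation and padding fields above, here is a minimal Python sketch of what `max_length: 512`, `truncation_direction: right`, `padding_strategy: longest`, `padding_direction: right`, and `pad_token_id: 0` conventionally mean for a batch of token IDs. It is an illustration only, not Hezar's implementation: the function name `pad_and_truncate` and the sample IDs are hypothetical, and the sketch does not cover `truncation_strategy`, `stride`, or `pad_to_multiple_of`.

```python
from typing import List

# Values copied from the config above; everything else is illustrative.
MAX_LENGTH = 512
PAD_TOKEN_ID = 0


def pad_and_truncate(
    batch: List[List[int]],
    max_length: int = MAX_LENGTH,
    pad_token_id: int = PAD_TOKEN_ID,
) -> List[List[int]]:
    """Right-truncate each sequence to max_length, then right-pad to the longest one."""
    # truncation_direction: right -> drop tokens from the end of each sequence
    truncated = [ids[:max_length] for ids in batch]
    # padding_strategy: longest -> the target length is the longest sequence in this batch
    target = max((len(ids) for ids in truncated), default=0)
    # padding_direction: right -> append pad IDs after the real tokens
    return [ids + [pad_token_id] * (target - len(ids)) for ids in truncated]


if __name__ == "__main__":
    # Hypothetical token IDs, not produced by the real tokenizer.
    print(pad_and_truncate([[5, 6, 7], [8]]))  # [[5, 6, 7], [8, 0, 0]]
```

Because padding targets the longest sequence in the batch rather than `max_length`, short batches stay short; `max_length` only acts as an upper bound through truncation.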