hezarai/bert-fa-mask-filling/preprocessor/tokenizer_config.yaml

arxyzan: Update preprocessor/tokenizer_config.yaml (commit 5920716, verified, 16 days ago), 425 Bytes

name: wordpiece_tokenizer
config_type: preprocessor
max_length: 512
truncation: longest_first
truncation_side: right
stride: 0
padding: longest
padding_side: right
pad_to_multiple_of: 0
pad_token_type_id: 0
unk_token: '[UNK]'
sep_token: '[SEP]'
pad_token: '[PAD]'
cls_token: '[CLS]'
mask_token: '[MASK]'
wordpieces_prefix: '##'
vocab_size: 42000
min_frequency: 2
limit_alphabet: 1000
initial_alphabet: []
show_progress: true
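
The truncation and padding keys above interact, so a small illustration may help. The following is a minimal pure-Python sketch, not the library's implementation, of what `max_length: 512`, `truncation_side: right`, `padding: longest`, and `padding_side: right` would do to a batch of token-id lists. The names `truncate`, `pad_batch`, `prepare`, and `PAD_ID` are hypothetical. The sketch assumes that `pad_to_multiple_of: 0` means "disabled" and that `[PAD]` maps to id 0, and it ignores special tokens such as `[CLS]` and `[SEP]`, which a real tokenizer accounts for when truncating.

```python
from typing import Dict, List

# Values copied from tokenizer_config.yaml above.
CONFIG = {
    "max_length": 512,
    "truncation_side": "right",
    "padding": "longest",
    "padding_side": "right",
    "pad_to_multiple_of": 0,
}

# Hypothetical id for '[PAD]'; the real id comes from the tokenizer vocabulary.
PAD_ID = 0


def truncate(ids: List[int], max_length: int, side: str = "right") -> List[int]:
    """Cut a sequence to at most max_length ids, dropping from the given side."""
    if len(ids) <= max_length:
        return list(ids)
    if side == "right":
        return list(ids[:max_length])
    return list(ids[-max_length:])


def pad_batch(
    batch: List[List[int]],
    pad_id: int,
    side: str = "right",
    multiple_of: int = 0,
) -> Dict[str, List[List[int]]]:
    """Pad every sequence to the longest one in the batch ('longest' padding)."""
    if not batch:
        return {"input_ids": [], "attention_mask": []}
    target = max(len(ids) for ids in batch)
    # Assumption: 0 disables rounding the target length up to a multiple.
    if multiple_of > 0 and target % multiple_of:
        target += multiple_of - target % multiple_of
    input_ids, attention_mask = [], []
    for ids in batch:
        fill = target - len(ids)
        if side == "right":
            input_ids.append(list(ids) + [pad_id] * fill)
            attention_mask.append([1] * len(ids) + [0] * fill)
        else:
            input_ids.append([pad_id] * fill + list(ids))
            attention_mask.append([0] * fill + [1] * len(ids))
    return {"input_ids": input_ids, "attention_mask": attention_mask}


def prepare(
    batch: List[List[int]],
    config: Dict[str, object] = CONFIG,
    pad_id: int = PAD_ID,
) -> Dict[str, List[List[int]]]:
    """Truncate each sequence first, then pad the batch to a common length."""
    truncated = [
        truncate(ids, config["max_length"], config["truncation_side"])
        for ids in batch
    ]
    return pad_batch(
        truncated, pad_id, config["padding_side"], config["pad_to_multiple_of"]
    )


if __name__ == "__main__":
    print(prepare([[5, 6, 7], [8]]))
```

With these settings a batch is only ever padded up to its own longest sequence, never up to 512, so short batches stay small. Switching `padding_side` to `left` in the sketch moves the pad ids and the zeros of the attention mask to the front of each sequence.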