BERT base Japanese (character-level tokenization with whole word masking, jawiki-20200831)

This pretrained model is almost the same as cl-tohoku/bert-base-japanese-char-v2 but does not need fugashi or unidic_lite. The only difference is the word_tokenizer_type property in tokenizer_config.json, which is set to basic instead of mecab.
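The configuration change described above can be sketched as a small patch on a tokenizer config. This is a minimal illustration, not the author's actual procedure: the `upstream_config` dictionary is a hypothetical stand-in for the upstream tokenizer_config.json (only the `word_tokenizer_type` values `mecab` and `basic` come from the description above, the other keys are assumed), and `use_basic_word_tokenizer` is a helper name introduced here for the example.

```python
import copy
import json

# Hypothetical stand-in for the upstream tokenizer_config.json.
# Only word_tokenizer_type is taken from the model description;
# the remaining keys are assumptions for illustration.
upstream_config = {
    "do_lower_case": False,
    "word_tokenizer_type": "mecab",
    "subword_tokenizer_type": "character",
}


def use_basic_word_tokenizer(config: dict) -> dict:
    """Return a copy of config with word_tokenizer_type set to 'basic'."""
    patched = copy.deepcopy(config)  # leave the input untouched
    patched["word_tokenizer_type"] = "basic"
    return patched


basic_config = use_basic_word_tokenizer(upstream_config)

# Collect the keys whose values differ between the two configs.
changed_keys = sorted(
    key for key in basic_config if basic_config[key] != upstream_config.get(key)
)

print(json.dumps(basic_config, indent=2))
print(changed_keys)
```

With the word tokenizer set to basic, the tokenizer should no longer call MeCab, which is why the fugashi and unidic_lite packages are not required to load this model.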
