---
license: afl-3.0
language:
  - zh
tags:
  - bert
  - chinesebert
  - MLM
pipeline_tag: fill-mask
---

# ChineseBERT-large

This project repackages ChineseBERT so that it can be called directly through the HuggingFace API, with no extra code configuration required.

Original paper: *ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information*, by Zijun Sun, Xiaoya Li, Xiaofei Sun, Yuxian Meng, Xiang Ao, Qing He, Fei Wu and Jiwei Li.

Original project: ChineseBERT github link

Original model: ShannonAI/ChineseBERT-large (that model cannot be called directly through the HuggingFace API)

## Usage

Open In Colab

1. Install pypinyin:

   ```
   pip install pypinyin
   ```
2. Load the tokenizer and model with the Auto classes:

   ```python
   from transformers import AutoTokenizer, AutoModel

   tokenizer = AutoTokenizer.from_pretrained("iioSnail/ChineseBERT-large", trust_remote_code=True)
   model = AutoModel.from_pretrained("iioSnail/ChineseBERT-large", trust_remote_code=True)
   ```
3. After that, usage is the same as for an ordinary BERT (a sketch for inspecting the top predictions follows this list):

   ```python
   inputs = tokenizer(["我 喜 [MASK] 猫"], return_tensors='pt')
   logits = model(**inputs).logits

   print(tokenizer.decode(logits.argmax(-1)[0, 1:-1]))
   ```

   Output:

   ```
   我 喜 欢 猫
   ```
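If you want to see more than the single argmax filler for the `[MASK]` slot, you can rank the logits at the masked position. A minimal sketch, assuming `inputs` and `logits` from step 3 above and a standard `input_ids` field in the tokenizer output (names such as `mask_pos` and `top5` are illustrative):

```python
import torch

# Find where the [MASK] token sits in the encoded input.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

# Rank the 5 highest-scoring vocabulary entries at that position.
top5 = torch.topk(logits[0, mask_pos], k=5, dim=-1)
for token_id in top5.indices[0]:
    print(tokenizer.decode(token_id.item()))
```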

## FAQ

1. Network problems, e.g. `Connection Error`.

   Solution: download the model and use it locally (see the sketch below). For downloading the files in batch, refer to this blog post.
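   As one way to download a local copy, a minimal sketch using `huggingface_hub` (the `local_dir` path is illustrative):

   ```python
   from huggingface_hub import snapshot_download
   from transformers import AutoTokenizer, AutoModel

   # Fetch every file of the repo into a local directory.
   local_path = snapshot_download(repo_id="iioSnail/ChineseBERT-large",
                                  local_dir="ChineseBERT-large")

   # trust_remote_code is still required: the model and tokenizer
   # classes live inside the downloaded repo.
   tokenizer = AutoTokenizer.from_pretrained(local_path, trust_remote_code=True)
   model = AutoModel.from_pretrained(local_path, trust_remote_code=True)
   ```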

2. When loading a locally downloaded copy of the model, the following error occurs: `ModuleNotFoundError: No module named 'transformers_modules.iioSnail/ChineseBERT-large'`

   Solution: change `iioSnail/ChineseBERT-large` to `iioSnail\ChineseBERT-large`, as sketched below.
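   For example, a minimal sketch of this workaround on Windows (it assumes the repo was downloaded into a local folder of that name):

   ```python
   from transformers import AutoTokenizer, AutoModel

   # Use a backslash as the separator in the local path on Windows;
   # a raw string avoids escape-sequence surprises.
   model_path = r"iioSnail\ChineseBERT-large"

   tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
   model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
   ```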