|
# paddle paddle版本的RoFormer |
|
|
|
# 需要安装最新的paddlenlp |
|
`pip install git+https://github.com/PaddlePaddle/PaddleNLP.git` |
|
|
|
## 预训练模型转换 |
|
|
|
预训练模型可以从 huggingface/transformers 转换而来,方法如下(适用于roformer模型,其他模型按情况调整): |
|
|
|
1. 从huggingface.co获取roformer模型权重 |
|
2. 设置参数运行convert.py代码 |
|
3. 例子: |
|
假设我想转换https://huggingface.co/junnyu/roformer_chinese_base 权重 |
|
- (1)首先下载 https://huggingface.co/junnyu/roformer_chinese_base/tree/main 中的pytorch_model.bin文件,假设我们存入了`./roformer_chinese_base/pytorch_model.bin` |
|
- (2)运行convert.py |
|
```bash |
|
python convert.py \ |
|
--pytorch_checkpoint_path ./roformer_chinese_base/pytorch_model.bin \ |
|
--paddle_dump_path ./roformer_chinese_base/model_state.pdparams |
|
``` |
|
- (3)最终我们得到了转化好的权重`./roformer_chinese_base/model_state.pdparams` |
|
|
|
|
|
## 预训练MLM测试 |
|
### test_mlm.py |
|
```python |
|
import paddle |
|
import argparse |
|
from paddlenlp.transformers import RoFormerForPretraining, RoFormerTokenizer |
|
|
|
def test_mlm(text, model_name): |
|
model = RoFormerForPretraining.from_pretrained(model_name) |
|
model.eval() |
|
tokenizer = RoFormerTokenizer.from_pretrained(model_name) |
|
tokens = ["[CLS]"] |
|
text_list = text.split("[MASK]") |
|
for i,t in enumerate(text_list): |
|
tokens.extend(tokenizer.tokenize(t)) |
|
if i==len(text_list)-1: |
|
tokens.extend(["[SEP]"]) |
|
else: |
|
tokens.extend(["[MASK]"]) |
|
|
|
input_ids_list = tokenizer.convert_tokens_to_ids(tokens) |
|
input_ids = paddle.to_tensor([input_ids_list]) |
|
|
|
with paddle.no_grad(): |
|
pd_outputs = model(input_ids)[0][0] |
|
pd_outputs_sentence = "paddle: " |
|
for i, id in enumerate(input_ids_list): |
|
if id == tokenizer.convert_tokens_to_ids(["[MASK]"])[0]: |
|
tokens = tokenizer.convert_ids_to_tokens(pd_outputs[i].topk(5)[1].tolist()) |
|
pd_outputs_sentence += "[" + "||".join(tokens) + "]" |
|
else: |
|
pd_outputs_sentence += "".join( |
|
tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True) |
|
) |
|
print(pd_outputs_sentence) |
|
|
|
if __name__ == "__main__": |
|
parser = argparse.ArgumentParser() |
|
parser.add_argument( |
|
"--model_name", default="roformer-chinese-base", type=str, help="Pretrained roformer name or path." |
|
) |
|
parser.add_argument( |
|
"--text", default="今天[MASK]很好,我想去公园玩!", type=str, help="MLM text." |
|
) |
|
args = parser.parse_args() |
|
test_mlm(text=args.text, model_name=args.model_name) |
|
|
|
``` |
|
### 输出 |
|
```bash |
|
python test_mlm.py --model_name roformer-chinese-base --text 今天[MASK]很好,我想去公园玩! |
|
# paddle: 今天[天气||天||阳光||太阳||空气]很好,我想去公园玩! |
|
python test_mlm.py --model_name roformer-chinese-base --text 北京是[MASK]的首都! |
|
# paddle: 北京是[中国||谁||中华人民共和国||我们||中华民族]的首都! |
|
python test_mlm.py --model_name roformer-chinese-char-base --text 今天[MASK]很好,我想去公园玩! |
|
# paddle: 今天[天||气||都||风||人]很好,我想去公园玩! |
|
python test_mlm.py --model_name roformer-chinese-char-base --text 北京是[MASK]的首都! |
|
# paddle: 北京是[谁||我||你||他||国]的首都! |
|
``` |