---
widget:
---

# Chinese RoBERTa-Base Model for QA

## Model description

The model is used for extractive question answering. You can download the model from the link [roberta-base-chinese-extractive-qa](https://huggingface.co/uer/roberta-base-chinese-extractive-qa).

## How to use

You can use the model directly with a pipeline for extractive question answering.
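
The usage example itself is hidden by this diff (README lines 17-26 are not shown), so the snippet below is a minimal sketch rather than the card's own example: it loads the model linked above and runs it through the standard `transformers` question-answering pipeline. The question/context pair is illustrative only.

```
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

# Load the fine-tuned model and tokenizer from the Hugging Face Hub.
model = AutoModelForQuestionAnswering.from_pretrained("uer/roberta-base-chinese-extractive-qa")
tokenizer = AutoTokenizer.from_pretrained("uer/roberta-base-chinese-extractive-qa")
qa = pipeline("question-answering", model=model, tokenizer=tokenizer)

# Illustrative question/context pair (not from the original card).
result = qa({
    "question": "北京是中国的什么？",
    "context": "北京是中国的首都，也是全国的政治和文化中心。",
})
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': '首都'}
```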

## Training data

Training data comes from three sources: [cmrc2018](https://github.com/ymcui/cmrc2018), [webqa](https://spaces.ac.cn/archives/4338), and [laisi](https://www.kesci.com/home/competition/5d142d8cbb14e6002c04e14a/content/0). We use only the train sets of the three datasets.
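
The merged training file `extractive_qa.json` referenced in the command below is produced from these three train sets. A hypothetical merge sketch, assuming each set has already been converted to a single shared JSON layout (a flat list of QA examples; the format `run_cmrc.py` actually expects is not shown in this diff):

```
import json

# Hypothetical filenames; each file is assumed to hold a flat JSON list
# of QA examples already converted to a common layout.
parts = ["cmrc2018_train.json", "webqa_train.json", "laisi_train.json"]

merged = []
for path in parts:
    with open(path, encoding="utf-8") as f:
        merged.extend(json.load(f))

# Write the combined train set passed as --train_path below.
with open("extractive_qa.json", "w", encoding="utf-8") as f:
    json.dump(merged, f, ensure_ascii=False)
```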

## Training procedure

The model is fine-tuned by [UER-py](https://github.com/dbiir/UER-py/) on [Tencent Cloud TI-ONE](https://cloud.tencent.com/product/tione/). We fine-tune for three epochs with a sequence length of 512 on the basis of the pre-trained model [chinese_roberta_L-12_H-768](https://huggingface.co/uer/chinese_roberta_L-12_H-768).

```
python3 run_cmrc.py --pretrained_model_path models/cluecorpussmall_roberta_base_seq512_model.bin-250000 \
                    --vocab_path models/google_zh_vocab.txt \
                    --train_path extractive_qa.json \
                    --dev_path datasets/cmrc2018/dev.json \
                    --output_model_path models/extractive_qa_model.bin \
                    --learning_rate 3e-5 --batch_size 32 --epochs_num 3 --seq_length 512 \
                    --embedding word_pos_seg --encoder transformer --mask fully_visible
```
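
The diff ends here. The remaining step in a typical UER-py workflow is converting the fine-tuned checkpoint to Hugging Face format before upload; a sketch assuming UER-py's conversion script and its flags (not shown in this diff):

```
# Assumed step: convert the UER-py checkpoint to Hugging Face format
# using the script from UER-py's scripts/ directory.
python3 scripts/convert_bert_extractive_qa_from_uer_to_huggingface.py --input_model_path models/extractive_qa_model.bin \
                                                                      --output_model_path pytorch_model.bin \
                                                                      --layers_num 12
```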