Update README.md
README.md CHANGED
@@ -32,7 +32,7 @@ You can use this model directly with a pipeline for token classification :
 
 ## Training data
 
-[
+[CLUENER2020](https://github.com/CLUEbenchmark/CLUENER2020) is used as training data. We only use the train set of the dataset.
 
 ## Training procedure
 
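The hunk context above says the model can be used directly with a pipeline for token classification. A minimal sketch of that usage follows; the repository ID below is an assumption for illustration, not stated in this commit, so substitute this model's actual Hugging Face ID:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Assumed repository ID for illustration -- substitute this model's actual ID.
model_id = "uer/roberta-base-finetuned-cluener2020-chinese"

model = AutoModelForTokenClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Token-classification ("ner") pipeline, as the README context line describes.
ner = pipeline("ner", model=model, tokenizer=tokenizer)
print(ner("江苏警方通报特斯拉冲进店铺"))
```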
@@ -44,7 +44,7 @@ python3 run_ner.py --pretrained_model_path models/cluecorpussmall_roberta_base_s
 --train_path datasets/cluener2020/train.tsv \
 --dev_path datasets/cluener2020/dev.tsv \
 --label2id_path datasets/cluener2020/label2id.json \
---output_model_path models/
+--output_model_path models/cluener2020_ner_model.bin \
 --learning_rate 3e-5 --batch_size 32 --epochs_num 5 --seq_length 512 \
 --embedding word_pos_seg --encoder transformer --mask fully_visible
 ```
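The training command above reads its tag set from `datasets/cluener2020/label2id.json`, which this commit does not show. A hedged sketch of its likely shape, assuming the usual BIO tagging over CLUENER2020's ten categories:

```python
# Sketch of datasets/cluener2020/label2id.json as a Python dict, assuming a
# BIO scheme over CLUENER2020's ten categories (address, book, company, game,
# government, movie, name, organization, position, scene). The actual file is
# not shown in this commit, so treat the exact tags and ids as guesses.
label2id = {
    "O": 0,
    "B-address": 1, "I-address": 2,
    "B-book": 3, "I-book": 4,
    # ...one B-/I- pair per remaining category, through "scene"
}
```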
@@ -52,7 +52,7 @@ python3 run_ner.py --pretrained_model_path models/cluecorpussmall_roberta_base_s
 Finally, we convert the pre-trained model into Huggingface's format:
 
 ```
-python3 scripts/convert_bert_token_classification_from_uer_to_huggingface.py --input_model_path models/
+python3 scripts/convert_bert_token_classification_from_uer_to_huggingface.py --input_model_path models/cluener2020_ner_model.bin \
 --output_model_path pytorch_model.bin \
 --layers_num 12
 ```
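Once the conversion script has produced `pytorch_model.bin`, a quick generic PyTorch check (not part of this commit) confirms the checkpoint loads:

```python
import torch

# Load the converted checkpoint on CPU and list a few parameter names
# and shapes as a sanity check.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
print(f"{len(state_dict)} tensors in checkpoint")
for name in list(state_dict)[:5]:
    print(name, tuple(state_dict[name].shape))
```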