zh_data_dev_spacy_trf_1
Chinese spaCy model, based on the stock zh_core_web_trf transformer pipeline, used for regular day-to-day data engineering.
Chinese transformer pipeline (Transformer(name='bert-base-chinese', piece_encoder='bert-wordpiece', stride=152, type='bert', width=768, window=208, vocab_size=21128)). Components: transformer, tagger, parser, ner, attribute_ruler.
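A minimal usage sketch. Assumptions: the packaged pipeline is installed locally and importable under the name `zh_data_dev_spacy_trf_1` (taken from this repository's name); the example sentence and printed attributes are illustrative only.

```python
import spacy

# Load the installed pipeline package. The package name below is an assumption
# based on this repository's name; the stock upstream pipeline would be loaded
# as "zh_core_web_trf" instead.
nlp = spacy.load("zh_data_dev_spacy_trf_1")

doc = nlp("微软于1998年在北京设立了研究院。")

# Named entities predicted by the ner component (e.g. ORG, DATE, GPE).
for ent in doc.ents:
    print(ent.text, ent.label_)

# Fine-grained POS tags and dependency relations from the tagger/parser.
for token in doc:
    print(token.text, token.tag_, token.dep_, token.head.text)
```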
Feature | Description |
---|---|
Name | zh_core_web_trf |
Version | 3.7.2 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | transformer, tagger, parser, attribute_ruler, ner |
Components | transformer, tagger, parser, attribute_ruler, ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); CoreNLP Universal Dependencies Converter (Stanford NLP Group); bert-base-chinese (Hugging Face) |
License | MIT |
Author | Explosion |
Label Scheme
99 labels for 3 components
Component | Labels |
---|---|
tagger | AD, AS, BA, CC, CD, CS, DEC, DEG, DER, DEV, DT, ETC, FW, IJ, INF, JJ, LB, LC, M, MSP, NN, NR, NT, OD, ON, P, PN, PU, SB, SP, URL, VA, VC, VE, VV, X |
parser | ROOT, acl, advcl:loc, advmod, advmod:dvp, advmod:loc, advmod:rcomp, amod, amod:ordmod, appos, aux:asp, aux:ba, aux:modal, aux:prtmod, auxpass, case, cc, ccomp, compound:nn, compound:vc, conj, cop, dep, det, discourse, dobj, etc, mark, mark:clf, name, neg, nmod, nmod:assmod, nmod:poss, nmod:prep, nmod:range, nmod:tmod, nmod:topic, nsubj, nsubj:xsubj, nsubjpass, nummod, parataxis:prnmod, punct, xcomp |
ner | CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART |
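The same label sets can be read off a loaded pipeline at runtime; a short sketch (the package name is again an assumption, as above):

```python
import spacy

nlp = spacy.load("zh_data_dev_spacy_trf_1")  # assumed package name, see above

# Each trained component exposes its label set as a tuple of strings.
for name in ("tagger", "parser", "ner"):
    pipe = nlp.get_pipe(name)
    print(f"{name}: {len(pipe.labels)} labels")
    print(sorted(pipe.labels))
```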
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 95.85 |
TOKEN_P | 94.58 |
TOKEN_R | 91.36 |
TOKEN_F | 92.94 |
TAG_ACC | 91.75 |
SENTS_P | 70.92 |
SENTS_R | 67.57 |
SENTS_F | 69.21 |
DEP_UAS | 75.72 |
DEP_LAS | 71.45 |
ENTS_P | 76.09 |
ENTS_R | 72.18 |
ENTS_F | 74.08 |
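These figures follow spaCy's standard scorer output. A hedged sketch of how comparable scores could be computed, assuming a held-out corpus serialized in `.spacy` (DocBin) format at a hypothetical path `./corpus/dev.spacy` (the OntoNotes 5 dev data is licensed and not distributed with the model):

```python
import spacy
from spacy.tokens import DocBin
from spacy.training import Example

nlp = spacy.load("zh_data_dev_spacy_trf_1")  # assumed package name

# Hypothetical path; the OntoNotes 5 dev split is licensed and not shipped here.
doc_bin = DocBin().from_disk("./corpus/dev.spacy")
gold_docs = list(doc_bin.get_docs(nlp.vocab))

# Pair a freshly tokenized doc with each gold doc; nlp.evaluate() runs the
# pipeline over the predicted docs and scores them against the references.
examples = [Example(nlp.make_doc(gold.text), gold) for gold in gold_docs]
scores = nlp.evaluate(examples)

print(scores["token_acc"], scores["tag_acc"])
print(scores["dep_uas"], scores["dep_las"])
print(scores["ents_p"], scores["ents_r"], scores["ents_f"])
```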