Adriane Boyd committed on
Commit
871a9a9
·
1 Parent(s): bc843e6

Update spaCy pipeline

README.md CHANGED
@@ -14,13 +14,13 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.6996904025
18
  - name: NER Recall
19
  type: recall
20
- value: 0.5685534591
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.6273421235
24
  - task:
25
  name: TAG
26
  type: token-classification
@@ -34,7 +34,7 @@ model-index:
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
- value: 0.9616721888
38
  - task:
39
  name: MORPH
40
  type: token-classification
@@ -48,28 +48,28 @@ model-index:
48
  metrics:
49
  - name: Lemma Accuracy
50
  type: accuracy
51
- value: 0.965013864
52
  - task:
53
  name: UNLABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Unlabeled Attachment Score (UAS)
57
  type: f_score
58
- value: 0.9207149611
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
  - name: Labeled Attachment Score (LAS)
64
  type: f_score
65
- value: 0.9061220818
66
  - task:
67
  name: SENTS
68
  type: token-classification
69
  metrics:
70
  - name: Sentences F-Score
71
  type: f_score
72
- value: 0.9774288518
73
  ---
74
  ### Details: https://spacy.io/models/ja#ja_core_news_sm
75
 
@@ -78,8 +78,8 @@ Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser,
78
  | Feature | Description |
79
  | --- | --- |
80
  | **Name** | `ja_core_news_sm` |
81
- | **Version** | `3.3.0` |
82
- | **spaCy** | `>=3.3.0.dev0,<3.4.0` |
83
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `ner` |
84
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `ner` |
85
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
@@ -91,11 +91,11 @@ Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser,
91
 
92
  <details>
93
 
94
- <summary>View label scheme (64 labels for 3 components)</summary>
95
 
96
  | Component | Labels |
97
  | --- | --- |
98
- | **`morphologizer`** | `POS=NOUN`, `POS=ADP`, `POS=VERB`, `POS=SCONJ`, `POS=AUX`, `POS=PUNCT`, `POS=PART`, `POS=DET`, `POS=NUM`, `POS=ADV`, `POS=PRON`, `POS=ADJ`, `POS=PROPN`, `POS=CCONJ`, `POS=SYM`, `POS=NOUN\|Polarity=Neg`, `POS=AUX\|Polarity=Neg`, `POS=INTJ`, `POS=SCONJ\|Polarity=Neg` |
99
  | **`parser`** | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `aux`, `case`, `cc`, `ccomp`, `compound`, `cop`, `csubj`, `dep`, `det`, `dislocated`, `fixed`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `punct` |
100
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `MOVEMENT`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PET_NAME`, `PHONE`, `PRODUCT`, `QUANTITY`, `TIME`, `TITLE_AFFIX`, `WORK_OF_ART` |
101
 
@@ -109,18 +109,18 @@ Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser,
109
  | `TOKEN_P` | 97.65 |
110
  | `TOKEN_R` | 97.90 |
111
  | `TOKEN_F` | 97.77 |
112
- | `POS_ACC` | 96.17 |
113
  | `MORPH_ACC` | 0.00 |
114
  | `MORPH_MICRO_P` | 34.01 |
115
  | `MORPH_MICRO_R` | 98.04 |
116
  | `MORPH_MICRO_F` | 50.51 |
117
- | `SENTS_P` | 97.27 |
118
  | `SENTS_R` | 98.22 |
119
- | `SENTS_F` | 97.74 |
120
- | `DEP_UAS` | 92.07 |
121
  | `DEP_LAS` | 90.61 |
122
  | `TAG_ACC` | 97.12 |
123
- | `LEMMA_ACC` | 96.50 |
124
- | `ENTS_P` | 69.97 |
125
- | `ENTS_R` | 56.86 |
126
- | `ENTS_F` | 62.73 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.6792168675
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.5672955975
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.6182316655
24
  - task:
25
  name: TAG
26
  type: token-classification
 
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9615085696
38
  - task:
39
  name: MORPH
40
  type: token-classification
 
48
  metrics:
49
  - name: Lemma Accuracy
50
  type: accuracy
51
+ value: 0.9670526831
52
  - task:
53
  name: UNLABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Unlabeled Attachment Score (UAS)
57
  type: f_score
58
+ value: 0.9200288184
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
  - name: Labeled Attachment Score (LAS)
64
  type: f_score
65
+ value: 0.9060559705
66
  - task:
67
  name: SENTS
68
  type: token-classification
69
  metrics:
70
  - name: Sentences F-Score
71
  type: f_score
72
+ value: 0.9745596869
73
  ---
74
  ### Details: https://spacy.io/models/ja#ja_core_news_sm
75
 
 
78
  | Feature | Description |
79
  | --- | --- |
80
  | **Name** | `ja_core_news_sm` |
81
+ | **Version** | `3.4.0` |
82
+ | **spaCy** | `>=3.4.0,<3.5.0` |
83
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `ner` |
84
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `ner` |
85
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
 
91
 
92
  <details>
93
 
94
+ <summary>View label scheme (65 labels for 3 components)</summary>
95
 
96
  | Component | Labels |
97
  | --- | --- |
98
+ | **`morphologizer`** | `POS=NOUN`, `POS=ADP`, `POS=VERB`, `POS=SCONJ`, `POS=AUX`, `POS=PUNCT`, `POS=PART`, `POS=DET`, `POS=NUM`, `POS=ADV`, `POS=PRON`, `POS=ADJ`, `POS=PROPN`, `POS=CCONJ`, `POS=SYM`, `POS=NOUN\|Polarity=Neg`, `POS=AUX\|Polarity=Neg`, `POS=SPACE`, `POS=INTJ`, `POS=SCONJ\|Polarity=Neg` |
99
  | **`parser`** | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `aux`, `case`, `cc`, `ccomp`, `compound`, `cop`, `csubj`, `dep`, `det`, `dislocated`, `fixed`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `punct` |
100
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `MOVEMENT`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PET_NAME`, `PHONE`, `PRODUCT`, `QUANTITY`, `TIME`, `TITLE_AFFIX`, `WORK_OF_ART` |
101
 
 
109
  | `TOKEN_P` | 97.65 |
110
  | `TOKEN_R` | 97.90 |
111
  | `TOKEN_F` | 97.77 |
112
+ | `POS_ACC` | 96.15 |
113
  | `MORPH_ACC` | 0.00 |
114
  | `MORPH_MICRO_P` | 34.01 |
115
  | `MORPH_MICRO_R` | 98.04 |
116
  | `MORPH_MICRO_F` | 50.51 |
117
+ | `SENTS_P` | 96.70 |
118
  | `SENTS_R` | 98.22 |
119
+ | `SENTS_F` | 97.46 |
120
+ | `DEP_UAS` | 92.00 |
121
  | `DEP_LAS` | 90.61 |
122
  | `TAG_ACC` | 97.12 |
123
+ | `LEMMA_ACC` | 96.71 |
124
+ | `ENTS_P` | 67.92 |
125
+ | `ENTS_R` | 56.73 |
126
+ | `ENTS_F` | 61.82 |
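
The F-scores reported in this README are consistent with the harmonic mean of the corresponding precision and recall. A minimal sketch for checking that relationship against the new NER values from this commit follows; the helper name `f_score` is hypothetical and is not claimed to be spaCy's own scoring code.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# New NER precision and recall taken from this commit's README metadata.
ents_f = f_score(0.6792168675, 0.5672955975)
print(round(ents_f, 4))
```

The same check can be applied to `SENTS_P`/`SENTS_R` or to any per-type entry in `accuracy.json`; the zero guard covers labels such as `MOVEMENT` where both values are 0.0.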
accuracy.json CHANGED
@@ -3,7 +3,7 @@
3
  "token_p": 0.9764591282,
4
  "token_r": 0.9790021974,
5
  "token_f": 0.9777290092,
6
- "pos_acc": 0.9616721888,
7
  "morph_acc": 0.0,
8
  "morph_micro_p": 0.3401360544,
9
  "morph_micro_r": 0.9803921569,
@@ -25,36 +25,36 @@
25
  "f": 0.0
26
  }
27
  },
28
- "sents_p": 0.97265625,
29
  "sents_r": 0.9822485207,
30
- "sents_f": 0.9774288518,
31
- "dep_uas": 0.9207149611,
32
- "dep_las": 0.9061220818,
33
  "dep_las_per_type": {
34
  "cc": {
35
- "p": 0.8260869565,
36
- "r": 0.7916666667,
37
- "f": 0.8085106383
38
  },
39
  "compound": {
40
- "p": 0.9384079024,
41
- "r": 0.9103720406,
42
- "f": 0.9241773963
43
  },
44
  "obl": {
45
- "p": 0.813283208,
46
- "r": 0.8102372035,
47
- "f": 0.8117573483
48
  },
49
  "case": {
50
- "p": 0.9881226054,
51
- "r": 0.9798632219,
52
- "f": 0.9839755818
53
  },
54
  "dislocated": {
55
- "p": 0.5,
56
  "r": 0.3846153846,
57
- "f": 0.4347826087
58
  },
59
  "nsubj": {
60
  "p": 0.8188824663,
@@ -62,171 +62,171 @@
62
  "f": 0.8173076923
63
  },
64
  "nmod": {
65
- "p": 0.8879093199,
66
- "r": 0.8245614035,
67
- "f": 0.855063675
68
  },
69
  "root": {
70
- "p": 0.9643564356,
71
- "r": 0.9605522682,
72
- "f": 0.9624505929
73
  },
74
  "aux": {
75
- "p": 0.9788213628,
76
- "r": 0.9870009285,
77
- "f": 0.9828941285
78
  },
79
  "advcl": {
80
- "p": 0.6824324324,
81
- "r": 0.6808988764,
82
- "f": 0.6816647919
83
  },
84
  "mark": {
85
- "p": 0.9696969697,
86
- "r": 0.96,
87
- "f": 0.9648241206
88
  },
89
  "fixed": {
90
- "p": 0.963898917,
91
- "r": 0.9709090909,
92
- "f": 0.9673913043
93
  },
94
  "acl": {
95
- "p": 0.8252212389,
96
- "r": 0.8197802198,
97
- "f": 0.822491731
98
  },
99
  "obj": {
100
- "p": 0.9446153846,
101
- "r": 0.9274924471,
102
- "f": 0.9359756098
103
  },
104
  "nummod": {
105
- "p": 0.9805194805,
106
  "r": 0.8934911243,
107
- "f": 0.9349845201
108
  },
109
  "advmod": {
110
- "p": 0.6788321168,
111
- "r": 0.6642857143,
112
- "f": 0.6714801444
113
  },
114
  "amod": {
115
- "p": 0.8125,
116
  "r": 0.7027027027,
117
- "f": 0.7536231884
118
  },
119
  "cop": {
120
- "p": 0.9756097561,
121
- "r": 0.9302325581,
122
- "f": 0.9523809524
123
  },
124
  "ccomp": {
125
- "p": 1.0,
126
- "r": 0.8636363636,
127
- "f": 0.9268292683
128
  },
129
- "dep": {
130
- "p": 0.0,
131
- "r": 0.0,
132
- "f": 0.0
133
  },
134
  "csubj": {
135
- "p": 0.5333333333,
136
  "r": 0.6666666667,
137
- "f": 0.5925925926
138
  },
139
- "det": {
140
- "p": 1.0,
141
- "r": 0.9811320755,
142
- "f": 0.9904761905
143
  }
144
  },
145
  "tag_acc": 0.9712488769,
146
- "lemma_acc": 0.965013864,
147
- "ents_p": 0.6996904025,
148
- "ents_r": 0.5685534591,
149
- "ents_f": 0.6273421235,
150
  "ents_per_type": {
151
  "DATE": {
152
- "p": 0.9454545455,
153
- "r": 0.9541284404,
154
- "f": 0.9497716895
155
- },
156
- "PRODUCT": {
157
- "p": 0.4814814815,
158
- "r": 0.3095238095,
159
- "f": 0.3768115942
160
  },
161
- "ORG": {
162
- "p": 0.5148514851,
163
- "r": 0.3795620438,
164
- "f": 0.4369747899
165
- },
166
- "QUANTITY": {
167
- "p": 0.8243243243,
168
- "r": 0.9242424242,
169
- "f": 0.8714285714
170
  },
171
  "GPE": {
172
- "p": 0.6179775281,
173
- "r": 0.585106383,
174
- "f": 0.6010928962
 
 
 
 
 
175
  },
176
  "TIME": {
177
  "p": 0.6666666667,
178
  "r": 1.0,
179
  "f": 0.8
180
  },
181
- "PERSON": {
182
- "p": 0.6632653061,
183
- "r": 0.4676258993,
184
- "f": 0.5485232068
185
  },
186
  "NORP": {
187
- "p": 0.6666666667,
188
- "r": 0.5625,
189
- "f": 0.6101694915
190
  },
191
- "ORDINAL": {
192
- "p": 0.5185185185,
193
- "r": 0.6363636364,
194
  "f": 0.5714285714
195
  },
196
- "TITLE_AFFIX": {
197
- "p": 0.6875,
198
- "r": 0.3666666667,
199
- "f": 0.4782608696
200
  },
201
- "FAC": {
202
- "p": 0.5882352941,
203
- "r": 0.2702702703,
204
- "f": 0.3703703704
205
  },
206
  "WORK_OF_ART": {
207
- "p": 0.9090909091,
208
  "r": 0.5882352941,
209
- "f": 0.7142857143
 
 
 
 
 
 
 
 
 
 
210
  },
211
  "PERCENT": {
212
  "p": 0.6666666667,
213
  "r": 0.2857142857,
214
  "f": 0.4
215
  },
216
- "EVENT": {
217
- "p": 0.8125,
218
- "r": 0.5,
219
- "f": 0.619047619
220
- },
221
- "CARDINAL": {
222
- "p": 0.0,
223
- "r": 0.0,
224
- "f": 0.0
225
  },
226
  "LOC": {
227
- "p": 0.8571428571,
228
- "r": 0.6,
229
- "f": 0.7058823529
230
  },
231
  "MOVEMENT": {
232
  "p": 0.0,
@@ -234,20 +234,20 @@
234
  "f": 0.0
235
  },
236
  "LAW": {
237
- "p": 1.0,
238
- "r": 0.3333333333,
239
- "f": 0.5
240
  },
241
  "MONEY": {
242
- "p": 0.875,
243
  "r": 1.0,
244
- "f": 0.9333333333
245
  },
246
  "LANGUAGE": {
247
- "p": 1.0,
248
  "r": 1.0,
249
- "f": 1.0
250
  }
251
  },
252
- "speed": 10590.4387625828
253
  }
 
3
  "token_p": 0.9764591282,
4
  "token_r": 0.9790021974,
5
  "token_f": 0.9777290092,
6
+ "pos_acc": 0.9615085696,
7
  "morph_acc": 0.0,
8
  "morph_micro_p": 0.3401360544,
9
  "morph_micro_r": 0.9803921569,
 
25
  "f": 0.0
26
  }
27
  },
28
+ "sents_p": 0.9669902913,
29
  "sents_r": 0.9822485207,
30
+ "sents_f": 0.9745596869,
31
+ "dep_uas": 0.9200288184,
32
+ "dep_las": 0.9060559705,
33
  "dep_las_per_type": {
34
  "cc": {
35
+ "p": 0.7959183673,
36
+ "r": 0.8125,
37
+ "f": 0.8041237113
38
  },
39
  "compound": {
40
+ "p": 0.9388824214,
41
+ "r": 0.9092446449,
42
+ "f": 0.9238258877
43
  },
44
  "obl": {
45
+ "p": 0.8032581454,
46
+ "r": 0.8002496879,
47
+ "f": 0.8017510944
48
  },
49
  "case": {
50
+ "p": 0.9888718342,
51
+ "r": 0.9791033435,
52
+ "f": 0.9839633448
53
  },
54
  "dislocated": {
55
+ "p": 0.625,
56
  "r": 0.3846153846,
57
+ "f": 0.4761904762
58
  },
59
  "nsubj": {
60
  "p": 0.8188824663,
 
62
  "f": 0.8173076923
63
  },
64
  "nmod": {
65
+ "p": 0.8813349815,
66
+ "r": 0.8339181287,
67
+ "f": 0.8569711538
68
  },
69
  "root": {
70
+ "p": 0.9625984252,
71
+ "r": 0.9644970414,
72
+ "f": 0.963546798
73
  },
74
  "aux": {
75
+ "p": 0.9760147601,
76
+ "r": 0.982358403,
77
+ "f": 0.9791763073
78
  },
79
  "advcl": {
80
+ "p": 0.6756756757,
81
+ "r": 0.6741573034,
82
+ "f": 0.6749156355
83
  },
84
  "mark": {
85
+ "p": 0.9754601227,
86
+ "r": 0.954,
87
+ "f": 0.9646107179
88
  },
89
  "fixed": {
90
+ "p": 0.9572192513,
91
+ "r": 0.9763636364,
92
+ "f": 0.9666966697
93
  },
94
  "acl": {
95
+ "p": 0.8296943231,
96
+ "r": 0.8351648352,
97
+ "f": 0.8324205915
98
  },
99
  "obj": {
100
+ "p": 0.9480122324,
101
+ "r": 0.9365558912,
102
+ "f": 0.9422492401
103
  },
104
  "nummod": {
105
+ "p": 0.9869281046,
106
  "r": 0.8934911243,
107
+ "f": 0.9378881988
108
  },
109
  "advmod": {
110
+ "p": 0.6923076923,
111
+ "r": 0.6428571429,
112
+ "f": 0.6666666667
113
  },
114
  "amod": {
115
+ "p": 0.8965517241,
116
  "r": 0.7027027027,
117
+ "f": 0.7878787879
118
  },
119
  "cop": {
120
+ "p": 0.9640718563,
121
+ "r": 0.9360465116,
122
+ "f": 0.9498525074
123
  },
124
  "ccomp": {
125
+ "p": 0.9473684211,
126
+ "r": 0.8181818182,
127
+ "f": 0.8780487805
128
  },
129
+ "det": {
130
+ "p": 0.9807692308,
131
+ "r": 0.9622641509,
132
+ "f": 0.9714285714
133
  },
134
  "csubj": {
135
+ "p": 0.8,
136
  "r": 0.6666666667,
137
+ "f": 0.7272727273
138
  },
139
+ "dep": {
140
+ "p": 0.0,
141
+ "r": 0.0,
142
+ "f": 0.0
143
  }
144
  },
145
  "tag_acc": 0.9712488769,
146
+ "lemma_acc": 0.9670526831,
147
+ "ents_p": 0.6792168675,
148
+ "ents_r": 0.5672955975,
149
+ "ents_f": 0.6182316655,
150
  "ents_per_type": {
151
  "DATE": {
152
+ "p": 0.9449541284,
153
+ "r": 0.9449541284,
154
+ "f": 0.9449541284
 
 
 
 
 
155
  },
156
+ "PERSON": {
157
+ "p": 0.5384615385,
158
+ "r": 0.4532374101,
159
+ "f": 0.4921875
 
 
 
 
 
160
  },
161
  "GPE": {
162
+ "p": 0.6578947368,
163
+ "r": 0.5319148936,
164
+ "f": 0.5882352941
165
+ },
166
+ "PRODUCT": {
167
+ "p": 0.48,
168
+ "r": 0.2857142857,
169
+ "f": 0.3582089552
170
  },
171
  "TIME": {
172
  "p": 0.6666666667,
173
  "r": 1.0,
174
  "f": 0.8
175
  },
176
+ "QUANTITY": {
177
+ "p": 0.8676470588,
178
+ "r": 0.8939393939,
179
+ "f": 0.8805970149
180
  },
181
  "NORP": {
182
+ "p": 0.7916666667,
183
+ "r": 0.59375,
184
+ "f": 0.6785714286
185
  },
186
+ "TITLE_AFFIX": {
187
+ "p": 0.7368421053,
188
+ "r": 0.4666666667,
189
  "f": 0.5714285714
190
  },
191
+ "ORG": {
192
+ "p": 0.5,
193
+ "r": 0.4160583942,
194
+ "f": 0.4541832669
195
  },
196
+ "ORDINAL": {
197
+ "p": 0.5,
198
+ "r": 0.5909090909,
199
+ "f": 0.5416666667
200
  },
201
  "WORK_OF_ART": {
202
+ "p": 0.6666666667,
203
  "r": 0.5882352941,
204
+ "f": 0.625
205
+ },
206
+ "CARDINAL": {
207
+ "p": 1.0,
208
+ "r": 0.5,
209
+ "f": 0.6666666667
210
+ },
211
+ "EVENT": {
212
+ "p": 0.8,
213
+ "r": 0.4615384615,
214
+ "f": 0.5853658537
215
  },
216
  "PERCENT": {
217
  "p": 0.6666666667,
218
  "r": 0.2857142857,
219
  "f": 0.4
220
  },
221
+ "FAC": {
222
+ "p": 0.6666666667,
223
+ "r": 0.3243243243,
224
+ "f": 0.4363636364
 
 
 
 
 
225
  },
226
  "LOC": {
227
+ "p": 0.5833333333,
228
+ "r": 0.7,
229
+ "f": 0.6363636364
230
  },
231
  "MOVEMENT": {
232
  "p": 0.0,
 
234
  "f": 0.0
235
  },
236
  "LAW": {
237
+ "p": 0.0,
238
+ "r": 0.0,
239
+ "f": 0.0
240
  },
241
  "MONEY": {
242
+ "p": 1.0,
243
  "r": 1.0,
244
+ "f": 1.0
245
  },
246
  "LANGUAGE": {
247
+ "p": 0.8571428571,
248
  "r": 1.0,
249
+ "f": 0.9230769231
250
  }
251
  },
252
+ "speed": 11044.5862640641
253
  }
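
Reading the `accuracy.json` changes line by line makes it hard to see which top-level scores moved and in which direction. The sketch below compares two score dictionaries and prints signed deltas. The function name `metric_deltas` is hypothetical, and the inline dictionaries hold only a handful of values copied from this diff rather than the full files.

```python
from typing import Dict


def metric_deltas(old: Dict[str, object], new: Dict[str, object]) -> Dict[str, float]:
    """Return new - old for every numeric key present in both dicts.

    Nested entries (for example "ents_per_type") are skipped.
    """
    deltas = {}
    for key in sorted(old.keys() & new.keys()):
        a, b = old[key], new[key]
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            deltas[key] = b - a
    return deltas


# A few top-level values copied from the removed (-) and added (+) lines.
old_scores = {
    "pos_acc": 0.9616721888,
    "lemma_acc": 0.965013864,
    "dep_las": 0.9061220818,
    "ents_f": 0.6273421235,
}
new_scores = {
    "pos_acc": 0.9615085696,
    "lemma_acc": 0.9670526831,
    "dep_las": 0.9060559705,
    "ents_f": 0.6182316655,
}

deltas = metric_deltas(old_scores, new_scores)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f}")
```

With real files, the two dictionaries would come from `json.load` on the old and new `accuracy.json`.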
ja_core_news_sm-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:abe7de8a5653288f79100243b208be37fc0ade5cd07e6bfdfc8b784424cfcf06
- size 11965420
+ oid sha256:cbc125fb8e0c4fa41923294eedec9f16792e96c8911e9c086f55443f84bc822e
+ size 11973416
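
The wheel is stored through Git LFS, so the diff shows only a small pointer file with three `key value` lines (`version`, `oid`, `size`). A minimal sketch for reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper rather than part of any Git LFS tooling, and it assumes the three keys shown in this diff are present.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New pointer content for the wheel, copied from the added (+) lines.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:cbc125fb8e0c4fa41923294eedec9f16792e96c8911e9c086f55443f84bc822e
size 11973416
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

Comparing `size` and `digest` between the old and new pointers shows whether the stored binary actually changed; several model files in this commit keep the same size but get a new digest.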
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"ja",
3
  "name":"core_news_sm",
4
- "version":"3.3.0",
5
  "description":"Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
- "spacy_version":">=3.3.0.dev0,<3.4.0",
11
- "spacy_git_version":"849bef2de",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
@@ -37,6 +37,7 @@
37
  "POS=SYM",
38
  "POS=NOUN|Polarity=Neg",
39
  "POS=AUX|Polarity=Neg",
 
40
  "POS=INTJ",
41
  "POS=SCONJ|Polarity=Neg"
42
  ],
@@ -116,7 +117,7 @@
116
  "token_p":0.9764591282,
117
  "token_r":0.9790021974,
118
  "token_f":0.9777290092,
119
- "pos_acc":0.9616721888,
120
  "morph_acc":0.0,
121
  "morph_micro_p":0.3401360544,
122
  "morph_micro_r":0.9803921569,
@@ -138,36 +139,36 @@
138
  "f":0.0
139
  }
140
  },
141
- "sents_p":0.97265625,
142
  "sents_r":0.9822485207,
143
- "sents_f":0.9774288518,
144
- "dep_uas":0.9207149611,
145
- "dep_las":0.9061220818,
146
  "dep_las_per_type":{
147
  "cc":{
148
- "p":0.8260869565,
149
- "r":0.7916666667,
150
- "f":0.8085106383
151
  },
152
  "compound":{
153
- "p":0.9384079024,
154
- "r":0.9103720406,
155
- "f":0.9241773963
156
  },
157
  "obl":{
158
- "p":0.813283208,
159
- "r":0.8102372035,
160
- "f":0.8117573483
161
  },
162
  "case":{
163
- "p":0.9881226054,
164
- "r":0.9798632219,
165
- "f":0.9839755818
166
  },
167
  "dislocated":{
168
- "p":0.5,
169
  "r":0.3846153846,
170
- "f":0.4347826087
171
  },
172
  "nsubj":{
173
  "p":0.8188824663,
@@ -175,171 +176,171 @@
175
  "f":0.8173076923
176
  },
177
  "nmod":{
178
- "p":0.8879093199,
179
- "r":0.8245614035,
180
- "f":0.855063675
181
  },
182
  "root":{
183
- "p":0.9643564356,
184
- "r":0.9605522682,
185
- "f":0.9624505929
186
  },
187
  "aux":{
188
- "p":0.9788213628,
189
- "r":0.9870009285,
190
- "f":0.9828941285
191
  },
192
  "advcl":{
193
- "p":0.6824324324,
194
- "r":0.6808988764,
195
- "f":0.6816647919
196
  },
197
  "mark":{
198
- "p":0.9696969697,
199
- "r":0.96,
200
- "f":0.9648241206
201
  },
202
  "fixed":{
203
- "p":0.963898917,
204
- "r":0.9709090909,
205
- "f":0.9673913043
206
  },
207
  "acl":{
208
- "p":0.8252212389,
209
- "r":0.8197802198,
210
- "f":0.822491731
211
  },
212
  "obj":{
213
- "p":0.9446153846,
214
- "r":0.9274924471,
215
- "f":0.9359756098
216
  },
217
  "nummod":{
218
- "p":0.9805194805,
219
  "r":0.8934911243,
220
- "f":0.9349845201
221
  },
222
  "advmod":{
223
- "p":0.6788321168,
224
- "r":0.6642857143,
225
- "f":0.6714801444
226
  },
227
  "amod":{
228
- "p":0.8125,
229
  "r":0.7027027027,
230
- "f":0.7536231884
231
  },
232
  "cop":{
233
- "p":0.9756097561,
234
- "r":0.9302325581,
235
- "f":0.9523809524
236
  },
237
  "ccomp":{
238
- "p":1.0,
239
- "r":0.8636363636,
240
- "f":0.9268292683
241
  },
242
- "dep":{
243
- "p":0.0,
244
- "r":0.0,
245
- "f":0.0
246
  },
247
  "csubj":{
248
- "p":0.5333333333,
249
  "r":0.6666666667,
250
- "f":0.5925925926
251
  },
252
- "det":{
253
- "p":1.0,
254
- "r":0.9811320755,
255
- "f":0.9904761905
256
  }
257
  },
258
  "tag_acc":0.9712488769,
259
- "lemma_acc":0.965013864,
260
- "ents_p":0.6996904025,
261
- "ents_r":0.5685534591,
262
- "ents_f":0.6273421235,
263
  "ents_per_type":{
264
  "DATE":{
265
- "p":0.9454545455,
266
- "r":0.9541284404,
267
- "f":0.9497716895
268
- },
269
- "PRODUCT":{
270
- "p":0.4814814815,
271
- "r":0.3095238095,
272
- "f":0.3768115942
273
  },
274
- "ORG":{
275
- "p":0.5148514851,
276
- "r":0.3795620438,
277
- "f":0.4369747899
278
- },
279
- "QUANTITY":{
280
- "p":0.8243243243,
281
- "r":0.9242424242,
282
- "f":0.8714285714
283
  },
284
  "GPE":{
285
- "p":0.6179775281,
286
- "r":0.585106383,
287
- "f":0.6010928962
 
 
 
 
 
288
  },
289
  "TIME":{
290
  "p":0.6666666667,
291
  "r":1.0,
292
  "f":0.8
293
  },
294
- "PERSON":{
295
- "p":0.6632653061,
296
- "r":0.4676258993,
297
- "f":0.5485232068
298
  },
299
  "NORP":{
300
- "p":0.6666666667,
301
- "r":0.5625,
302
- "f":0.6101694915
303
  },
304
- "ORDINAL":{
305
- "p":0.5185185185,
306
- "r":0.6363636364,
307
  "f":0.5714285714
308
  },
309
- "TITLE_AFFIX":{
310
- "p":0.6875,
311
- "r":0.3666666667,
312
- "f":0.4782608696
313
  },
314
- "FAC":{
315
- "p":0.5882352941,
316
- "r":0.2702702703,
317
- "f":0.3703703704
318
  },
319
  "WORK_OF_ART":{
320
- "p":0.9090909091,
321
  "r":0.5882352941,
322
- "f":0.7142857143
 
 
 
 
 
 
 
 
 
 
323
  },
324
  "PERCENT":{
325
  "p":0.6666666667,
326
  "r":0.2857142857,
327
  "f":0.4
328
  },
329
- "EVENT":{
330
- "p":0.8125,
331
- "r":0.5,
332
- "f":0.619047619
333
- },
334
- "CARDINAL":{
335
- "p":0.0,
336
- "r":0.0,
337
- "f":0.0
338
  },
339
  "LOC":{
340
- "p":0.8571428571,
341
- "r":0.6,
342
- "f":0.7058823529
343
  },
344
  "MOVEMENT":{
345
  "p":0.0,
@@ -347,22 +348,22 @@
347
  "f":0.0
348
  },
349
  "LAW":{
350
- "p":1.0,
351
- "r":0.3333333333,
352
- "f":0.5
353
  },
354
  "MONEY":{
355
- "p":0.875,
356
  "r":1.0,
357
- "f":0.9333333333
358
  },
359
  "LANGUAGE":{
360
- "p":1.0,
361
  "r":1.0,
362
- "f":1.0
363
  }
364
  },
365
- "speed":10590.4387625828
366
  },
367
  "sources":[
368
  {
 
1
  {
2
  "lang":"ja",
3
  "name":"core_news_sm",
4
+ "version":"3.4.0",
5
  "description":"Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
+ "spacy_version":">=3.4.0,<3.5.0",
11
+ "spacy_git_version":"dd038b536",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
 
37
  "POS=SYM",
38
  "POS=NOUN|Polarity=Neg",
39
  "POS=AUX|Polarity=Neg",
40
+ "POS=SPACE",
41
  "POS=INTJ",
42
  "POS=SCONJ|Polarity=Neg"
43
  ],
 
117
  "token_p":0.9764591282,
118
  "token_r":0.9790021974,
119
  "token_f":0.9777290092,
120
+ "pos_acc":0.9615085696,
121
  "morph_acc":0.0,
122
  "morph_micro_p":0.3401360544,
123
  "morph_micro_r":0.9803921569,
 
139
  "f":0.0
140
  }
141
  },
142
+ "sents_p":0.9669902913,
143
  "sents_r":0.9822485207,
144
+ "sents_f":0.9745596869,
145
+ "dep_uas":0.9200288184,
146
+ "dep_las":0.9060559705,
147
  "dep_las_per_type":{
148
  "cc":{
149
+ "p":0.7959183673,
150
+ "r":0.8125,
151
+ "f":0.8041237113
152
  },
153
  "compound":{
154
+ "p":0.9388824214,
155
+ "r":0.9092446449,
156
+ "f":0.9238258877
157
  },
158
  "obl":{
159
+ "p":0.8032581454,
160
+ "r":0.8002496879,
161
+ "f":0.8017510944
162
  },
163
  "case":{
164
+ "p":0.9888718342,
165
+ "r":0.9791033435,
166
+ "f":0.9839633448
167
  },
168
  "dislocated":{
169
+ "p":0.625,
170
  "r":0.3846153846,
171
+ "f":0.4761904762
172
  },
173
  "nsubj":{
174
  "p":0.8188824663,
 
176
  "f":0.8173076923
177
  },
178
  "nmod":{
179
+ "p":0.8813349815,
180
+ "r":0.8339181287,
181
+ "f":0.8569711538
182
  },
183
  "root":{
184
+ "p":0.9625984252,
185
+ "r":0.9644970414,
186
+ "f":0.963546798
187
  },
188
  "aux":{
189
+ "p":0.9760147601,
190
+ "r":0.982358403,
191
+ "f":0.9791763073
192
  },
193
  "advcl":{
194
+ "p":0.6756756757,
195
+ "r":0.6741573034,
196
+ "f":0.6749156355
197
  },
198
  "mark":{
199
+ "p":0.9754601227,
200
+ "r":0.954,
201
+ "f":0.9646107179
202
  },
203
  "fixed":{
204
+ "p":0.9572192513,
205
+ "r":0.9763636364,
206
+ "f":0.9666966697
207
  },
208
  "acl":{
209
+ "p":0.8296943231,
210
+ "r":0.8351648352,
211
+ "f":0.8324205915
212
  },
213
  "obj":{
214
+ "p":0.9480122324,
215
+ "r":0.9365558912,
216
+ "f":0.9422492401
217
  },
218
  "nummod":{
219
+ "p":0.9869281046,
220
  "r":0.8934911243,
221
+ "f":0.9378881988
222
  },
223
  "advmod":{
224
+ "p":0.6923076923,
225
+ "r":0.6428571429,
226
+ "f":0.6666666667
227
  },
228
  "amod":{
229
+ "p":0.8965517241,
230
  "r":0.7027027027,
231
+ "f":0.7878787879
232
  },
233
  "cop":{
234
+ "p":0.9640718563,
235
+ "r":0.9360465116,
236
+ "f":0.9498525074
237
  },
238
  "ccomp":{
239
+ "p":0.9473684211,
240
+ "r":0.8181818182,
241
+ "f":0.8780487805
242
  },
243
+ "det":{
244
+ "p":0.9807692308,
245
+ "r":0.9622641509,
246
+ "f":0.9714285714
247
  },
248
  "csubj":{
249
+ "p":0.8,
250
  "r":0.6666666667,
251
+ "f":0.7272727273
252
  },
253
+ "dep":{
254
+ "p":0.0,
255
+ "r":0.0,
256
+ "f":0.0
257
  }
258
  },
259
  "tag_acc":0.9712488769,
260
+ "lemma_acc":0.9670526831,
261
+ "ents_p":0.6792168675,
262
+ "ents_r":0.5672955975,
263
+ "ents_f":0.6182316655,
264
  "ents_per_type":{
265
  "DATE":{
266
+ "p":0.9449541284,
267
+ "r":0.9449541284,
268
+ "f":0.9449541284
 
 
 
 
 
269
  },
270
+ "PERSON":{
271
+ "p":0.5384615385,
272
+ "r":0.4532374101,
273
+ "f":0.4921875
 
 
 
 
 
274
  },
275
  "GPE":{
276
+ "p":0.6578947368,
277
+ "r":0.5319148936,
278
+ "f":0.5882352941
279
+ },
280
+ "PRODUCT":{
281
+ "p":0.48,
282
+ "r":0.2857142857,
283
+ "f":0.3582089552
284
  },
285
  "TIME":{
286
  "p":0.6666666667,
287
  "r":1.0,
288
  "f":0.8
289
  },
290
+ "QUANTITY":{
291
+ "p":0.8676470588,
292
+ "r":0.8939393939,
293
+ "f":0.8805970149
294
  },
295
  "NORP":{
296
+ "p":0.7916666667,
297
+ "r":0.59375,
298
+ "f":0.6785714286
299
  },
300
+ "TITLE_AFFIX":{
301
+ "p":0.7368421053,
302
+ "r":0.4666666667,
303
  "f":0.5714285714
304
  },
305
+ "ORG":{
306
+ "p":0.5,
307
+ "r":0.4160583942,
308
+ "f":0.4541832669
309
  },
310
+ "ORDINAL":{
311
+ "p":0.5,
312
+ "r":0.5909090909,
313
+ "f":0.5416666667
314
  },
315
  "WORK_OF_ART":{
316
+ "p":0.6666666667,
317
  "r":0.5882352941,
318
+ "f":0.625
319
+ },
320
+ "CARDINAL":{
321
+ "p":1.0,
322
+ "r":0.5,
323
+ "f":0.6666666667
324
+ },
325
+ "EVENT":{
326
+ "p":0.8,
327
+ "r":0.4615384615,
328
+ "f":0.5853658537
329
  },
330
  "PERCENT":{
331
  "p":0.6666666667,
332
  "r":0.2857142857,
333
  "f":0.4
334
  },
335
+ "FAC":{
336
+ "p":0.6666666667,
337
+ "r":0.3243243243,
338
+ "f":0.4363636364
 
 
 
 
 
339
  },
340
  "LOC":{
341
+ "p":0.5833333333,
342
+ "r":0.7,
343
+ "f":0.6363636364
344
  },
345
  "MOVEMENT":{
346
  "p":0.0,
 
348
  "f":0.0
349
  },
350
  "LAW":{
351
+ "p":0.0,
352
+ "r":0.0,
353
+ "f":0.0
354
  },
355
  "MONEY":{
356
+ "p":1.0,
357
  "r":1.0,
358
+ "f":1.0
359
  },
360
  "LANGUAGE":{
361
+ "p":0.8571428571,
362
  "r":1.0,
363
+ "f":0.9230769231
364
  }
365
  },
366
+ "speed":11044.5862640641
367
  },
368
  "sources":[
369
  {
morphologizer/cfg CHANGED
@@ -18,6 +18,7 @@
  "POS=SYM":"",
  "POS=NOUN|Polarity=Neg":"Polarity=Neg",
  "POS=AUX|Polarity=Neg":"Polarity=Neg",
+ "POS=SPACE":"",
  "POS=INTJ":"",
  "POS=SCONJ|Polarity=Neg":"Polarity=Neg"
  },
@@ -39,6 +40,7 @@
  "POS=SYM":99,
  "POS=NOUN|Polarity=Neg":92,
  "POS=AUX|Polarity=Neg":87,
+ "POS=SPACE":103,
  "POS=INTJ":91,
  "POS=SCONJ|Polarity=Neg":98
  },
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:54cc5d13db9e81893cb14d34cce3c20877984a99ace9c89b8339861e1e5daba6
- size 7801
+ oid sha256:3b06b5631b6158914b83210df4c9c3f2f97f4bdb911e380a3aa80276223f52b3
+ size 8189
ner/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:748916f006122c16ac8203ed42ff9a480c9efb7e7a533ab8d0ca7f21e1df8146
+ oid sha256:2451b5201fd7b812b16ad9aba82db399514afada6fb39703bef3b4a143bf9811
  size 6158761
ner/moves CHANGED
@@ -1 +1 @@
- ��moves��{"0":{},"1":{"DATE":4200,"ORG":3487,"PERSON":3042,"QUANTITY":2519,"GPE":1953,"PRODUCT":1328,"FAC":1243,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":734,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"2":{"DATE":4200,"ORG":3487,"PERSON":3042,"QUANTITY":2519,"GPE":1953,"PRODUCT":1328,"FAC":1243,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":734,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"3":{"DATE":4200,"ORG":3487,"PERSON":3042,"QUANTITY":2519,"GPE":1953,"PRODUCT":1328,"FAC":1243,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":734,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"4":{"DATE":4200,"ORG":3487,"PERSON":3042,"QUANTITY":2519,"GPE":1953,"PRODUCT":1328,"FAC":1243,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":734,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4,"":1},"5":{"":1}}�cfg��neg_key�
 
+ ��moves��{"0":{},"1":{"DATE":4200,"ORG":3488,"PERSON":3043,"QUANTITY":2521,"GPE":1953,"PRODUCT":1328,"FAC":1244,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":735,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"2":{"DATE":4200,"ORG":3488,"PERSON":3043,"QUANTITY":2521,"GPE":1953,"PRODUCT":1328,"FAC":1244,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":735,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"3":{"DATE":4200,"ORG":3488,"PERSON":3043,"QUANTITY":2521,"GPE":1953,"PRODUCT":1328,"FAC":1244,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":735,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"4":{"DATE":4200,"ORG":3488,"PERSON":3043,"QUANTITY":2521,"GPE":1953,"PRODUCT":1328,"FAC":1244,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":735,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4,"":1},"5":{"":1}}�cfg��neg_key�
parser/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:271dd83bf4634cda5578e228b20ae8ba0fcb62aa2cdf1a972435c4c9cb13591a
+ oid sha256:123648e466c6d2a1374fa5b6ababffcf4380f314c3a7d8e0f5c4772d697c082b
  size 299888
parser/moves CHANGED
@@ -1 +1 @@
- ��moves�~{"0":{"":77992},"1":{"":83293},"2":{"compound":23506,"nmod":11446,"obl":11030,"nsubj":6884,"advcl":6063,"acl":6020,"obj":4629,"nummod":2487,"advmod":1922,"punct":1321,"det":830,"cc":726,"amod":372,"ccomp":325,"dislocated":235,"csubj":133,"dep":0},"3":{"case":35913,"punct":15455,"aux":14940,"fixed":7391,"mark":6644,"cop":2100,"compound":598,"advcl":152,"dep":58},"4":{"ROOT":7050}}�cfg��neg_key�
 
+ ��moves�{"0":{"":77992},"1":{"":83431},"2":{"compound":23506,"nmod":11446,"obl":11030,"nsubj":6884,"advcl":6063,"acl":6020,"obj":4629,"nummod":2487,"advmod":1922,"punct":1321,"det":830,"cc":726,"amod":372,"ccomp":325,"dislocated":235,"csubj":133,"dep":0},"3":{"case":35913,"punct":15455,"aux":14940,"fixed":7391,"mark":6644,"cop":2100,"compound":598,"dep":196,"advcl":152},"4":{"ROOT":7050}}�cfg��neg_key�
senter/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c4041306bc68fa4038c05676bd86c4efb4f54f8527eecdee0848625398d79c09
+ oid sha256:d9fae2019492ddc4015a4779f8bcccc539e33e75b3a5987fd4c7e1a9b3109abf
  size 190447
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:12c2a49988bde947980a8fca6448bee9ed02554273da176f6e869cea2f69abdb
+ oid sha256:03d8b91d52feaf870e0274b308d51f16888184475719544be4dfca28e68a6a40
  size 6009091
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:056e9c81ef594430afdb87d434bb9926610548e24145c2350a413dde52ee7865
- size 1603226
+ oid sha256:e1bf573690c4ec067dd67641c1dd1d61cff10480e193d93d94ba3f305bfc2d58
+ size 1604157