Adriane Boyd committed on
Commit
62d5a10
1 Parent(s): 102bb34

Update spaCy pipeline

.gitattributes CHANGED
@@ -19,3 +19,4 @@
  *strings.json filter=lfs diff=lfs merge=lfs -text
  vectors filter=lfs diff=lfs merge=lfs -text
  model filter=lfs diff=lfs merge=lfs -text
+ vocab/key2row filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -14,13 +14,13 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.7384196185
18
  - name: NER Recall
19
  type: recall
20
- value: 0.6817610063
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.7089601046
24
  - task:
25
  name: TAG
26
  type: token-classification
@@ -34,7 +34,7 @@ model-index:
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
- value: 0.9708372531
38
  - task:
39
  name: MORPH
40
  type: token-classification
@@ -48,28 +48,28 @@ model-index:
48
  metrics:
49
  - name: Lemma Accuracy
50
  type: accuracy
51
- value: 0.965013864
52
  - task:
53
  name: UNLABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Unlabeled Attachment Score (UAS)
57
  type: f_score
58
- value: 0.9211237169
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
  - name: Labeled Attachment Score (LAS)
64
  type: f_score
65
- value: 0.9075282365
66
  - task:
67
  name: SENTS
68
  type: token-classification
69
  metrics:
70
  - name: Sentences F-Score
71
  type: f_score
72
- value: 0.9940828402
73
  ---
74
  ### Details: https://spacy.io/models/ja#ja_core_news_md
75
 
@@ -78,8 +78,8 @@ Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser,
78
  | Feature | Description |
79
  | --- | --- |
80
  | **Name** | `ja_core_news_md` |
81
- | **Version** | `3.3.0` |
82
- | **spaCy** | `>=3.3.0.dev0,<3.4.0` |
83
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `ner` |
84
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `ner` |
85
  | **Vectors** | 480443 keys, 20000 unique vectors (300 dimensions) |
@@ -91,11 +91,11 @@ Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser,
91
 
92
  <details>
93
 
94
- <summary>View label scheme (64 labels for 3 components)</summary>
95
 
96
  | Component | Labels |
97
  | --- | --- |
98
- | **`morphologizer`** | `POS=NOUN`, `POS=ADP`, `POS=VERB`, `POS=SCONJ`, `POS=AUX`, `POS=PUNCT`, `POS=PART`, `POS=DET`, `POS=NUM`, `POS=ADV`, `POS=PRON`, `POS=ADJ`, `POS=PROPN`, `POS=CCONJ`, `POS=SYM`, `POS=NOUN\|Polarity=Neg`, `POS=AUX\|Polarity=Neg`, `POS=INTJ`, `POS=SCONJ\|Polarity=Neg` |
99
  | **`parser`** | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `aux`, `case`, `cc`, `ccomp`, `compound`, `cop`, `csubj`, `dep`, `det`, `dislocated`, `fixed`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `punct` |
100
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `MOVEMENT`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PET_NAME`, `PHONE`, `PRODUCT`, `QUANTITY`, `TIME`, `TITLE_AFFIX`, `WORK_OF_ART` |
101
 
@@ -109,18 +109,18 @@ Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser,
109
  | `TOKEN_P` | 97.65 |
110
  | `TOKEN_R` | 97.90 |
111
  | `TOKEN_F` | 97.77 |
112
- | `POS_ACC` | 97.08 |
113
  | `MORPH_ACC` | 0.00 |
114
  | `MORPH_MICRO_P` | 34.01 |
115
  | `MORPH_MICRO_R` | 98.04 |
116
  | `MORPH_MICRO_F` | 50.51 |
117
- | `SENTS_P` | 99.41 |
118
- | `SENTS_R` | 99.41 |
119
- | `SENTS_F` | 99.41 |
120
- | `DEP_UAS` | 92.11 |
121
- | `DEP_LAS` | 90.75 |
122
  | `TAG_ACC` | 97.12 |
123
- | `LEMMA_ACC` | 96.50 |
124
- | `ENTS_P` | 73.84 |
125
- | `ENTS_R` | 68.18 |
126
- | `ENTS_F` | 70.90 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.7244623656
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.6779874214
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.7004548408
24
  - task:
25
  name: TAG
26
  type: token-classification
 
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9733311518
38
  - task:
39
  name: MORPH
40
  type: token-classification
 
48
  metrics:
49
  - name: Lemma Accuracy
50
  type: accuracy
51
+ value: 0.9670526831
52
  - task:
53
  name: UNLABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Unlabeled Attachment Score (UAS)
57
  type: f_score
58
+ value: 0.9198487032
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
  - name: Labeled Attachment Score (LAS)
64
  type: f_score
65
+ value: 0.9061782838
66
  - task:
67
  name: SENTS
68
  type: token-classification
69
  metrics:
70
  - name: Sentences F-Score
71
  type: f_score
72
+ value: 0.9901380671
73
  ---
74
  ### Details: https://spacy.io/models/ja#ja_core_news_md
75
 
 
78
  | Feature | Description |
79
  | --- | --- |
80
  | **Name** | `ja_core_news_md` |
81
+ | **Version** | `3.4.0` |
82
+ | **spaCy** | `>=3.4.0,<3.5.0` |
83
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `ner` |
84
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `ner` |
85
  | **Vectors** | 480443 keys, 20000 unique vectors (300 dimensions) |
 
91
 
92
  <details>
93
 
94
+ <summary>View label scheme (65 labels for 3 components)</summary>
95
 
96
  | Component | Labels |
97
  | --- | --- |
98
+ | **`morphologizer`** | `POS=NOUN`, `POS=ADP`, `POS=VERB`, `POS=SCONJ`, `POS=AUX`, `POS=PUNCT`, `POS=PART`, `POS=DET`, `POS=NUM`, `POS=ADV`, `POS=PRON`, `POS=ADJ`, `POS=PROPN`, `POS=CCONJ`, `POS=SYM`, `POS=NOUN\|Polarity=Neg`, `POS=AUX\|Polarity=Neg`, `POS=SPACE`, `POS=INTJ`, `POS=SCONJ\|Polarity=Neg` |
99
  | **`parser`** | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `aux`, `case`, `cc`, `ccomp`, `compound`, `cop`, `csubj`, `dep`, `det`, `dislocated`, `fixed`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `punct` |
100
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `MOVEMENT`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PET_NAME`, `PHONE`, `PRODUCT`, `QUANTITY`, `TIME`, `TITLE_AFFIX`, `WORK_OF_ART` |
101
 
 
109
  | `TOKEN_P` | 97.65 |
110
  | `TOKEN_R` | 97.90 |
111
  | `TOKEN_F` | 97.77 |
112
+ | `POS_ACC` | 97.33 |
113
  | `MORPH_ACC` | 0.00 |
114
  | `MORPH_MICRO_P` | 34.01 |
115
  | `MORPH_MICRO_R` | 98.04 |
116
  | `MORPH_MICRO_F` | 50.51 |
117
+ | `SENTS_P` | 99.01 |
118
+ | `SENTS_R` | 99.01 |
119
+ | `SENTS_F` | 99.01 |
120
+ | `DEP_UAS` | 91.98 |
121
+ | `DEP_LAS` | 90.62 |
122
  | `TAG_ACC` | 97.12 |
123
+ | `LEMMA_ACC` | 96.71 |
124
+ | `ENTS_P` | 72.45 |
125
+ | `ENTS_R` | 67.80 |
126
+ | `ENTS_F` | 70.05 |
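The F-scores reported in this README diff are consistent with the harmonic mean of the precision and recall listed beside them. A minimal Python sketch of that check, assuming the standard F1 definition; the helper name `f_score` is hypothetical and not part of spaCy:

```python
def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if p + r == 0.0:
        # Avoid division by zero when both scores are 0 (e.g. PET_NAME).
        return 0.0
    return 2.0 * p * r / (p + r)


# Precision and recall copied from the 3.4.0 side of this diff.
ents_f = f_score(0.7244623656, 0.6779874214)
print(f"NER F-score: {ents_f:.10f}")
```

The same helper can be applied to any per-type `p`/`r` pair in `accuracy.json` to confirm the stored `f` value.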
accuracy.json CHANGED
@@ -3,7 +3,7 @@
3
  "token_p": 0.9764591282,
4
  "token_r": 0.9790021974,
5
  "token_f": 0.9777290092,
6
- "pos_acc": 0.9708372531,
7
  "morph_acc": 0.0,
8
  "morph_micro_p": 0.3401360544,
9
  "morph_micro_r": 0.9803921569,
@@ -25,91 +25,91 @@
25
  "f": 0.0
26
  }
27
  },
28
- "sents_p": 0.9940828402,
29
- "sents_r": 0.9940828402,
30
- "sents_f": 0.9940828402,
31
- "dep_uas": 0.9211237169,
32
- "dep_las": 0.9075282365,
33
  "dep_las_per_type": {
34
  "cc": {
35
- "p": 0.8125,
36
- "r": 0.8125,
37
- "f": 0.8125
38
  },
39
  "compound": {
40
- "p": 0.9490930369,
41
- "r": 0.9143179256,
42
- "f": 0.9313809934
43
  },
44
  "obl": {
45
- "p": 0.80875,
46
- "r": 0.8077403246,
47
- "f": 0.808244847
48
  },
49
  "case": {
50
- "p": 0.9884925201,
51
- "r": 0.9791033435,
52
- "f": 0.9837755297
53
  },
54
  "dislocated": {
55
- "p": 0.6666666667,
56
- "r": 0.4615384615,
57
- "f": 0.5454545455
58
  },
59
  "nsubj": {
60
- "p": 0.8146718147,
61
- "r": 0.8099808061,
62
- "f": 0.812319538
63
  },
64
  "nmod": {
65
- "p": 0.8869565217,
66
- "r": 0.8350877193,
67
- "f": 0.8602409639
68
  },
69
  "root": {
70
- "p": 0.966,
71
  "r": 0.9526627219,
72
- "f": 0.959285005
73
  },
74
  "aux": {
75
- "p": 0.9768732655,
76
- "r": 0.9805013928,
77
- "f": 0.9786839666
78
  },
79
  "advcl": {
80
- "p": 0.6876404494,
81
- "r": 0.6876404494,
82
- "f": 0.6876404494
83
  },
84
  "mark": {
85
- "p": 0.9696969697,
86
  "r": 0.96,
87
- "f": 0.9648241206
88
  },
89
  "fixed": {
90
- "p": 0.9566003617,
91
- "r": 0.9618181818,
92
- "f": 0.9592021759
93
  },
94
  "acl": {
95
- "p": 0.8278867102,
96
- "r": 0.8351648352,
97
- "f": 0.8315098468
98
  },
99
  "obj": {
100
- "p": 0.9456193353,
101
- "r": 0.9456193353,
102
- "f": 0.9456193353
103
  },
104
  "nummod": {
105
- "p": 0.9325153374,
106
- "r": 0.899408284,
107
- "f": 0.9156626506
108
  },
109
  "advmod": {
110
- "p": 0.7054263566,
111
- "r": 0.65,
112
- "f": 0.6765799257
113
  },
114
  "amod": {
115
  "p": 0.9642857143,
@@ -117,9 +117,9 @@
117
  "f": 0.8307692308
118
  },
119
  "cop": {
120
- "p": 0.9700598802,
121
- "r": 0.9418604651,
122
- "f": 0.9557522124
123
  },
124
  "ccomp": {
125
  "p": 0.95,
@@ -127,56 +127,56 @@
127
  "f": 0.9047619048
128
  },
129
  "det": {
130
- "p": 0.9807692308,
131
- "r": 0.9622641509,
132
- "f": 0.9714285714
133
  },
134
  "csubj": {
135
- "p": 0.6153846154,
136
- "r": 0.6666666667,
137
- "f": 0.64
138
  },
139
  "dep": {
140
- "p": 0.0,
141
- "r": 0.0,
142
- "f": 0.0
143
  }
144
  },
145
  "tag_acc": 0.9712488769,
146
- "lemma_acc": 0.965013864,
147
- "ents_p": 0.7384196185,
148
- "ents_r": 0.6817610063,
149
- "ents_f": 0.7089601046,
150
  "ents_per_type": {
151
  "DATE": {
152
- "p": 0.9722222222,
153
  "r": 0.9633027523,
154
- "f": 0.9677419355
 
 
 
 
 
155
  },
156
  "ORG": {
157
- "p": 0.6097560976,
158
- "r": 0.5474452555,
159
- "f": 0.5769230769
160
  },
161
  "TITLE_AFFIX": {
162
- "p": 0.8636363636,
163
- "r": 0.6333333333,
164
- "f": 0.7307692308
165
- },
166
- "PERSON": {
167
- "p": 0.762962963,
168
- "r": 0.7410071942,
169
- "f": 0.7518248175
170
  },
171
  "GPE": {
172
- "p": 0.7282608696,
173
- "r": 0.7127659574,
174
- "f": 0.7204301075
175
  },
176
  "PRODUCT": {
177
- "p": 0.4375,
178
- "r": 0.3333333333,
179
- "f": 0.3783783784
180
  },
181
  "TIME": {
182
  "p": 0.6666666667,
@@ -184,9 +184,9 @@
184
  "f": 0.8
185
  },
186
  "QUANTITY": {
187
- "p": 0.8769230769,
188
- "r": 0.8636363636,
189
- "f": 0.8702290076
190
  },
191
  "NORP": {
192
  "p": 0.72,
@@ -194,19 +194,19 @@
194
  "f": 0.6315789474
195
  },
196
  "ORDINAL": {
197
- "p": 0.5172413793,
198
- "r": 0.6818181818,
199
- "f": 0.5882352941
200
  },
201
  "WORK_OF_ART": {
202
- "p": 0.5789473684,
203
- "r": 0.6470588235,
204
- "f": 0.6111111111
205
  },
206
  "FAC": {
207
- "p": 0.625,
208
- "r": 0.4054054054,
209
- "f": 0.4918032787
210
  },
211
  "PERCENT": {
212
  "p": 1.0,
@@ -214,9 +214,9 @@
214
  "f": 0.4444444444
215
  },
216
  "EVENT": {
217
- "p": 0.7619047619,
218
- "r": 0.6153846154,
219
- "f": 0.6808510638
220
  },
221
  "CARDINAL": {
222
  "p": 0.0,
@@ -224,9 +224,9 @@
224
  "f": 0.0
225
  },
226
  "LOC": {
227
- "p": 0.5384615385,
228
- "r": 0.7,
229
- "f": 0.6086956522
230
  },
231
  "MOVEMENT": {
232
  "p": 0.0,
@@ -238,11 +238,6 @@
238
  "r": 0.3333333333,
239
  "f": 0.5
240
  },
241
- "PET_NAME": {
242
- "p": 0.0,
243
- "r": 0.0,
244
- "f": 0.0
245
- },
246
  "MONEY": {
247
  "p": 1.0,
248
  "r": 1.0,
@@ -252,7 +247,12 @@
252
  "p": 1.0,
253
  "r": 1.0,
254
  "f": 1.0
 
 
 
 
 
255
  }
256
  },
257
- "speed": 8044.6200194392
258
  }
 
3
  "token_p": 0.9764591282,
4
  "token_r": 0.9790021974,
5
  "token_f": 0.9777290092,
6
+ "pos_acc": 0.9733311518,
7
  "morph_acc": 0.0,
8
  "morph_micro_p": 0.3401360544,
9
  "morph_micro_r": 0.9803921569,
 
25
  "f": 0.0
26
  }
27
  },
28
+ "sents_p": 0.9901380671,
29
+ "sents_r": 0.9901380671,
30
+ "sents_f": 0.9901380671,
31
+ "dep_uas": 0.9198487032,
32
+ "dep_las": 0.9061782838,
33
  "dep_las_per_type": {
34
  "cc": {
35
+ "p": 0.7115384615,
36
+ "r": 0.7708333333,
37
+ "f": 0.74
38
  },
39
  "compound": {
40
+ "p": 0.9336051252,
41
+ "r": 0.9036076663,
42
+ "f": 0.918361501
43
  },
44
  "obl": {
45
+ "p": 0.8171500631,
46
+ "r": 0.808988764,
47
+ "f": 0.8130489335
48
  },
49
  "case": {
50
+ "p": 0.9892720307,
51
+ "r": 0.9810030395,
52
+ "f": 0.9851201831
53
  },
54
  "dislocated": {
55
+ "p": 0.6428571429,
56
+ "r": 0.6923076923,
57
+ "f": 0.6666666667
58
  },
59
  "nsubj": {
60
+ "p": 0.8233009709,
61
+ "r": 0.8138195777,
62
+ "f": 0.8185328185
63
  },
64
  "nmod": {
65
+ "p": 0.8746898263,
66
+ "r": 0.8245614035,
67
+ "f": 0.8488862131
68
  },
69
  "root": {
70
+ "p": 0.9679358717,
71
  "r": 0.9526627219,
72
+ "f": 0.9602385686
73
  },
74
  "aux": {
75
+ "p": 0.9787037037,
76
+ "r": 0.9814298979,
77
+ "f": 0.980064905
78
  },
79
  "advcl": {
80
+ "p": 0.6944444444,
81
+ "r": 0.6741573034,
82
+ "f": 0.6841505131
83
  },
84
  "mark": {
85
+ "p": 0.9775967413,
86
  "r": 0.96,
87
+ "f": 0.9687184662
88
  },
89
  "fixed": {
90
+ "p": 0.9588550984,
91
+ "r": 0.9745454545,
92
+ "f": 0.9666366096
93
  },
94
  "acl": {
95
+ "p": 0.8315334773,
96
+ "r": 0.8461538462,
97
+ "f": 0.8387799564
98
  },
99
  "obj": {
100
+ "p": 0.9480122324,
101
+ "r": 0.9365558912,
102
+ "f": 0.9422492401
103
  },
104
  "nummod": {
105
+ "p": 0.9805194805,
106
+ "r": 0.8934911243,
107
+ "f": 0.9349845201
108
  },
109
  "advmod": {
110
+ "p": 0.6917293233,
111
+ "r": 0.6571428571,
112
+ "f": 0.673992674
113
  },
114
  "amod": {
115
  "p": 0.9642857143,
 
117
  "f": 0.8307692308
118
  },
119
  "cop": {
120
+ "p": 0.9523809524,
121
+ "r": 0.9302325581,
122
+ "f": 0.9411764706
123
  },
124
  "ccomp": {
125
  "p": 0.95,
 
127
  "f": 0.9047619048
128
  },
129
  "det": {
130
+ "p": 0.9615384615,
131
+ "r": 0.9433962264,
132
+ "f": 0.9523809524
133
  },
134
  "csubj": {
135
+ "p": 0.7142857143,
136
+ "r": 0.8333333333,
137
+ "f": 0.7692307692
138
  },
139
  "dep": {
140
+ "p": 0.2,
141
+ "r": 0.1428571429,
142
+ "f": 0.1666666667
143
  }
144
  },
145
  "tag_acc": 0.9712488769,
146
+ "lemma_acc": 0.9670526831,
147
+ "ents_p": 0.7244623656,
148
+ "ents_r": 0.6779874214,
149
+ "ents_f": 0.7004548408,
150
  "ents_per_type": {
151
  "DATE": {
152
+ "p": 0.9545454545,
153
  "r": 0.9633027523,
154
+ "f": 0.9589041096
155
+ },
156
+ "PERSON": {
157
+ "p": 0.7152777778,
158
+ "r": 0.7410071942,
159
+ "f": 0.7279151943
160
  },
161
  "ORG": {
162
+ "p": 0.6333333333,
163
+ "r": 0.5547445255,
164
+ "f": 0.5914396887
165
  },
166
  "TITLE_AFFIX": {
167
+ "p": 0.8333333333,
168
+ "r": 0.6666666667,
169
+ "f": 0.7407407407
 
 
 
 
 
170
  },
171
  "GPE": {
172
+ "p": 0.6741573034,
173
+ "r": 0.6382978723,
174
+ "f": 0.6557377049
175
  },
176
  "PRODUCT": {
177
+ "p": 0.3636363636,
178
+ "r": 0.2857142857,
179
+ "f": 0.32
180
  },
181
  "TIME": {
182
  "p": 0.6666666667,
 
184
  "f": 0.8
185
  },
186
  "QUANTITY": {
187
+ "p": 0.8194444444,
188
+ "r": 0.8939393939,
189
+ "f": 0.8550724638
190
  },
191
  "NORP": {
192
  "p": 0.72,
 
194
  "f": 0.6315789474
195
  },
196
  "ORDINAL": {
197
+ "p": 0.5833333333,
198
+ "r": 0.6363636364,
199
+ "f": 0.6086956522
200
  },
201
  "WORK_OF_ART": {
202
+ "p": 0.6842105263,
203
+ "r": 0.7647058824,
204
+ "f": 0.7222222222
205
  },
206
  "FAC": {
207
+ "p": 0.6538461538,
208
+ "r": 0.4594594595,
209
+ "f": 0.5396825397
210
  },
211
  "PERCENT": {
212
  "p": 1.0,
 
214
  "f": 0.4444444444
215
  },
216
  "EVENT": {
217
+ "p": 0.7368421053,
218
+ "r": 0.5384615385,
219
+ "f": 0.6222222222
220
  },
221
  "CARDINAL": {
222
  "p": 0.0,
 
224
  "f": 0.0
225
  },
226
  "LOC": {
227
+ "p": 0.6153846154,
228
+ "r": 0.8,
229
+ "f": 0.6956521739
230
  },
231
  "MOVEMENT": {
232
  "p": 0.0,
 
238
  "r": 0.3333333333,
239
  "f": 0.5
240
  },
 
 
 
 
 
241
  "MONEY": {
242
  "p": 1.0,
243
  "r": 1.0,
 
247
  "p": 1.0,
248
  "r": 1.0,
249
  "f": 1.0
250
+ },
251
+ "PET_NAME": {
252
+ "p": 0.0,
253
+ "r": 0.0,
254
+ "f": 0.0
255
  }
256
  },
257
+ "speed": 7222.6711221355
258
  }
ja_core_news_md-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:80bc4423a0a9d8ce761c25750c28885c055caa83bb4ae45b14937975f9b63bcb
- size 41986714
+ oid sha256:df45e0c8206a077ef983bb413aca7b81b29bba454856225d2f6f0fe66b27c632
+ size 41990789
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"ja",
3
  "name":"core_news_md",
4
- "version":"3.3.0",
5
  "description":"Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
- "spacy_version":">=3.3.0.dev0,<3.4.0",
11
- "spacy_git_version":"849bef2de",
12
  "vectors":{
13
  "width":300,
14
  "vectors":20000,
@@ -37,6 +37,7 @@
37
  "POS=SYM",
38
  "POS=NOUN|Polarity=Neg",
39
  "POS=AUX|Polarity=Neg",
 
40
  "POS=INTJ",
41
  "POS=SCONJ|Polarity=Neg"
42
  ],
@@ -116,7 +117,7 @@
116
  "token_p":0.9764591282,
117
  "token_r":0.9790021974,
118
  "token_f":0.9777290092,
119
- "pos_acc":0.9708372531,
120
  "morph_acc":0.0,
121
  "morph_micro_p":0.3401360544,
122
  "morph_micro_r":0.9803921569,
@@ -138,91 +139,91 @@
138
  "f":0.0
139
  }
140
  },
141
- "sents_p":0.9940828402,
142
- "sents_r":0.9940828402,
143
- "sents_f":0.9940828402,
144
- "dep_uas":0.9211237169,
145
- "dep_las":0.9075282365,
146
  "dep_las_per_type":{
147
  "cc":{
148
- "p":0.8125,
149
- "r":0.8125,
150
- "f":0.8125
151
  },
152
  "compound":{
153
- "p":0.9490930369,
154
- "r":0.9143179256,
155
- "f":0.9313809934
156
  },
157
  "obl":{
158
- "p":0.80875,
159
- "r":0.8077403246,
160
- "f":0.808244847
161
  },
162
  "case":{
163
- "p":0.9884925201,
164
- "r":0.9791033435,
165
- "f":0.9837755297
166
  },
167
  "dislocated":{
168
- "p":0.6666666667,
169
- "r":0.4615384615,
170
- "f":0.5454545455
171
  },
172
  "nsubj":{
173
- "p":0.8146718147,
174
- "r":0.8099808061,
175
- "f":0.812319538
176
  },
177
  "nmod":{
178
- "p":0.8869565217,
179
- "r":0.8350877193,
180
- "f":0.8602409639
181
  },
182
  "root":{
183
- "p":0.966,
184
  "r":0.9526627219,
185
- "f":0.959285005
186
  },
187
  "aux":{
188
- "p":0.9768732655,
189
- "r":0.9805013928,
190
- "f":0.9786839666
191
  },
192
  "advcl":{
193
- "p":0.6876404494,
194
- "r":0.6876404494,
195
- "f":0.6876404494
196
  },
197
  "mark":{
198
- "p":0.9696969697,
199
  "r":0.96,
200
- "f":0.9648241206
201
  },
202
  "fixed":{
203
- "p":0.9566003617,
204
- "r":0.9618181818,
205
- "f":0.9592021759
206
  },
207
  "acl":{
208
- "p":0.8278867102,
209
- "r":0.8351648352,
210
- "f":0.8315098468
211
  },
212
  "obj":{
213
- "p":0.9456193353,
214
- "r":0.9456193353,
215
- "f":0.9456193353
216
  },
217
  "nummod":{
218
- "p":0.9325153374,
219
- "r":0.899408284,
220
- "f":0.9156626506
221
  },
222
  "advmod":{
223
- "p":0.7054263566,
224
- "r":0.65,
225
- "f":0.6765799257
226
  },
227
  "amod":{
228
  "p":0.9642857143,
@@ -230,9 +231,9 @@
230
  "f":0.8307692308
231
  },
232
  "cop":{
233
- "p":0.9700598802,
234
- "r":0.9418604651,
235
- "f":0.9557522124
236
  },
237
  "ccomp":{
238
  "p":0.95,
@@ -240,56 +241,56 @@
240
  "f":0.9047619048
241
  },
242
  "det":{
243
- "p":0.9807692308,
244
- "r":0.9622641509,
245
- "f":0.9714285714
246
  },
247
  "csubj":{
248
- "p":0.6153846154,
249
- "r":0.6666666667,
250
- "f":0.64
251
  },
252
  "dep":{
253
- "p":0.0,
254
- "r":0.0,
255
- "f":0.0
256
  }
257
  },
258
  "tag_acc":0.9712488769,
259
- "lemma_acc":0.965013864,
260
- "ents_p":0.7384196185,
261
- "ents_r":0.6817610063,
262
- "ents_f":0.7089601046,
263
  "ents_per_type":{
264
  "DATE":{
265
- "p":0.9722222222,
266
  "r":0.9633027523,
267
- "f":0.9677419355
 
 
 
 
 
268
  },
269
  "ORG":{
270
- "p":0.6097560976,
271
- "r":0.5474452555,
272
- "f":0.5769230769
273
  },
274
  "TITLE_AFFIX":{
275
- "p":0.8636363636,
276
- "r":0.6333333333,
277
- "f":0.7307692308
278
- },
279
- "PERSON":{
280
- "p":0.762962963,
281
- "r":0.7410071942,
282
- "f":0.7518248175
283
  },
284
  "GPE":{
285
- "p":0.7282608696,
286
- "r":0.7127659574,
287
- "f":0.7204301075
288
  },
289
  "PRODUCT":{
290
- "p":0.4375,
291
- "r":0.3333333333,
292
- "f":0.3783783784
293
  },
294
  "TIME":{
295
  "p":0.6666666667,
@@ -297,9 +298,9 @@
297
  "f":0.8
298
  },
299
  "QUANTITY":{
300
- "p":0.8769230769,
301
- "r":0.8636363636,
302
- "f":0.8702290076
303
  },
304
  "NORP":{
305
  "p":0.72,
@@ -307,19 +308,19 @@
307
  "f":0.6315789474
308
  },
309
  "ORDINAL":{
310
- "p":0.5172413793,
311
- "r":0.6818181818,
312
- "f":0.5882352941
313
  },
314
  "WORK_OF_ART":{
315
- "p":0.5789473684,
316
- "r":0.6470588235,
317
- "f":0.6111111111
318
  },
319
  "FAC":{
320
- "p":0.625,
321
- "r":0.4054054054,
322
- "f":0.4918032787
323
  },
324
  "PERCENT":{
325
  "p":1.0,
@@ -327,9 +328,9 @@
327
  "f":0.4444444444
328
  },
329
  "EVENT":{
330
- "p":0.7619047619,
331
- "r":0.6153846154,
332
- "f":0.6808510638
333
  },
334
  "CARDINAL":{
335
  "p":0.0,
@@ -337,9 +338,9 @@
337
  "f":0.0
338
  },
339
  "LOC":{
340
- "p":0.5384615385,
341
- "r":0.7,
342
- "f":0.6086956522
343
  },
344
  "MOVEMENT":{
345
  "p":0.0,
@@ -351,11 +352,6 @@
351
  "r":0.3333333333,
352
  "f":0.5
353
  },
354
- "PET_NAME":{
355
- "p":0.0,
356
- "r":0.0,
357
- "f":0.0
358
- },
359
  "MONEY":{
360
  "p":1.0,
361
  "r":1.0,
@@ -365,9 +361,14 @@
365
  "p":1.0,
366
  "r":1.0,
367
  "f":1.0
 
 
 
 
 
368
  }
369
  },
370
- "speed":8044.6200194392
371
  },
372
  "sources":[
373
  {
 
1
  {
2
  "lang":"ja",
3
  "name":"core_news_md",
4
+ "version":"3.4.0",
5
  "description":"Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
+ "spacy_version":">=3.4.0,<3.5.0",
11
+ "spacy_git_version":"dd038b536",
12
  "vectors":{
13
  "width":300,
14
  "vectors":20000,
 
37
  "POS=SYM",
38
  "POS=NOUN|Polarity=Neg",
39
  "POS=AUX|Polarity=Neg",
40
+ "POS=SPACE",
41
  "POS=INTJ",
42
  "POS=SCONJ|Polarity=Neg"
43
  ],
 
117
  "token_p":0.9764591282,
118
  "token_r":0.9790021974,
119
  "token_f":0.9777290092,
120
+ "pos_acc":0.9733311518,
121
  "morph_acc":0.0,
122
  "morph_micro_p":0.3401360544,
123
  "morph_micro_r":0.9803921569,
 
139
  "f":0.0
140
  }
141
  },
142
+ "sents_p":0.9901380671,
143
+ "sents_r":0.9901380671,
144
+ "sents_f":0.9901380671,
145
+ "dep_uas":0.9198487032,
146
+ "dep_las":0.9061782838,
147
  "dep_las_per_type":{
148
  "cc":{
149
+ "p":0.7115384615,
150
+ "r":0.7708333333,
151
+ "f":0.74
152
  },
153
  "compound":{
154
+ "p":0.9336051252,
155
+ "r":0.9036076663,
156
+ "f":0.918361501
157
  },
158
  "obl":{
159
+ "p":0.8171500631,
160
+ "r":0.808988764,
161
+ "f":0.8130489335
162
  },
163
  "case":{
164
+ "p":0.9892720307,
165
+ "r":0.9810030395,
166
+ "f":0.9851201831
167
  },
168
  "dislocated":{
169
+ "p":0.6428571429,
170
+ "r":0.6923076923,
171
+ "f":0.6666666667
172
  },
173
  "nsubj":{
174
+ "p":0.8233009709,
175
+ "r":0.8138195777,
176
+ "f":0.8185328185
177
  },
178
  "nmod":{
179
+ "p":0.8746898263,
180
+ "r":0.8245614035,
181
+ "f":0.8488862131
182
  },
183
  "root":{
184
+ "p":0.9679358717,
185
  "r":0.9526627219,
186
+ "f":0.9602385686
187
  },
188
  "aux":{
189
+ "p":0.9787037037,
190
+ "r":0.9814298979,
191
+ "f":0.980064905
192
  },
193
  "advcl":{
194
+ "p":0.6944444444,
195
+ "r":0.6741573034,
196
+ "f":0.6841505131
197
  },
198
  "mark":{
199
+ "p":0.9775967413,
200
  "r":0.96,
201
+ "f":0.9687184662
202
  },
203
  "fixed":{
204
+ "p":0.9588550984,
205
+ "r":0.9745454545,
206
+ "f":0.9666366096
207
  },
208
  "acl":{
209
+ "p":0.8315334773,
210
+ "r":0.8461538462,
211
+ "f":0.8387799564
212
  },
213
  "obj":{
214
+ "p":0.9480122324,
215
+ "r":0.9365558912,
216
+ "f":0.9422492401
217
  },
218
  "nummod":{
219
+ "p":0.9805194805,
220
+ "r":0.8934911243,
221
+ "f":0.9349845201
222
  },
223
  "advmod":{
224
+ "p":0.6917293233,
225
+ "r":0.6571428571,
226
+ "f":0.673992674
227
  },
228
  "amod":{
229
  "p":0.9642857143,
 
231
  "f":0.8307692308
232
  },
233
  "cop":{
234
+ "p":0.9523809524,
235
+ "r":0.9302325581,
236
+ "f":0.9411764706
237
  },
238
  "ccomp":{
239
  "p":0.95,
 
241
  "f":0.9047619048
242
  },
243
  "det":{
244
+ "p":0.9615384615,
245
+ "r":0.9433962264,
246
+ "f":0.9523809524
247
  },
248
  "csubj":{
249
+ "p":0.7142857143,
250
+ "r":0.8333333333,
251
+ "f":0.7692307692
252
  },
253
  "dep":{
254
+ "p":0.2,
255
+ "r":0.1428571429,
256
+ "f":0.1666666667
257
  }
258
  },
259
  "tag_acc":0.9712488769,
260
+ "lemma_acc":0.9670526831,
261
+ "ents_p":0.7244623656,
262
+ "ents_r":0.6779874214,
263
+ "ents_f":0.7004548408,
264
  "ents_per_type":{
265
  "DATE":{
266
+ "p":0.9545454545,
267
  "r":0.9633027523,
268
+ "f":0.9589041096
269
+ },
270
+ "PERSON":{
271
+ "p":0.7152777778,
272
+ "r":0.7410071942,
273
+ "f":0.7279151943
274
  },
275
  "ORG":{
276
+ "p":0.6333333333,
277
+ "r":0.5547445255,
278
+ "f":0.5914396887
279
  },
280
  "TITLE_AFFIX":{
281
+ "p":0.8333333333,
282
+ "r":0.6666666667,
283
+ "f":0.7407407407
 
 
 
 
 
284
  },
285
  "GPE":{
286
+ "p":0.6741573034,
287
+ "r":0.6382978723,
288
+ "f":0.6557377049
289
  },
290
  "PRODUCT":{
291
+ "p":0.3636363636,
292
+ "r":0.2857142857,
293
+ "f":0.32
294
  },
295
  "TIME":{
296
  "p":0.6666666667,
 
298
  "f":0.8
299
  },
300
  "QUANTITY":{
301
+ "p":0.8194444444,
302
+ "r":0.8939393939,
303
+ "f":0.8550724638
304
  },
305
  "NORP":{
306
  "p":0.72,
 
308
  "f":0.6315789474
309
  },
310
  "ORDINAL":{
311
+ "p":0.5833333333,
312
+ "r":0.6363636364,
313
+ "f":0.6086956522
314
  },
315
  "WORK_OF_ART":{
316
+ "p":0.6842105263,
317
+ "r":0.7647058824,
318
+ "f":0.7222222222
319
  },
320
  "FAC":{
321
+ "p":0.6538461538,
322
+ "r":0.4594594595,
323
+ "f":0.5396825397
324
  },
325
  "PERCENT":{
326
  "p":1.0,
 
328
  "f":0.4444444444
329
  },
330
  "EVENT":{
331
+ "p":0.7368421053,
332
+ "r":0.5384615385,
333
+ "f":0.6222222222
334
  },
335
  "CARDINAL":{
336
  "p":0.0,
 
338
  "f":0.0
339
  },
340
  "LOC":{
341
+ "p":0.6153846154,
342
+ "r":0.8,
343
+ "f":0.6956521739
344
  },
345
  "MOVEMENT":{
346
  "p":0.0,
 
352
  "r":0.3333333333,
353
  "f":0.5
354
  },
 
 
 
 
 
355
  "MONEY":{
356
  "p":1.0,
357
  "r":1.0,
 
361
  "p":1.0,
362
  "r":1.0,
363
  "f":1.0
364
+ },
365
+ "PET_NAME":{
366
+ "p":0.0,
367
+ "r":0.0,
368
+ "f":0.0
369
  }
370
  },
371
+ "speed":7222.6711221355
372
  },
373
  "sources":[
374
  {
morphologizer/cfg CHANGED
@@ -18,6 +18,7 @@
  "POS=SYM":"",
  "POS=NOUN|Polarity=Neg":"Polarity=Neg",
  "POS=AUX|Polarity=Neg":"Polarity=Neg",
+ "POS=SPACE":"",
  "POS=INTJ":"",
  "POS=SCONJ|Polarity=Neg":"Polarity=Neg"
  },
@@ -39,6 +40,7 @@
  "POS=SYM":99,
  "POS=NOUN|Polarity=Neg":92,
  "POS=AUX|Polarity=Neg":87,
+ "POS=SPACE":103,
  "POS=INTJ":91,
  "POS=SCONJ|Polarity=Neg":98
  },
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5489ddf9014098b25974a1783b6b2bcb71cbd118776e7a1fbbf5e8e34756652f
- size 7801
+ oid sha256:40a61d03889a18d216ecec75486609a43994214da406a8ba2b09bb0a8b94b20d
+ size 8189
ner/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4c1e7e625dc2fa41ce822a430d8745261ddaf0c2a0084c84305847db5e73f993
+ oid sha256:249b11cd2cf32cc4699b57696d3a0543d260f2ff581c141a5513e5c557f0b887
  size 6385103
ner/moves CHANGED
@@ -1 +1 @@
1
- ��moves��{"0":{},"1":{"DATE":4200,"ORG":3487,"PERSON":3042,"QUANTITY":2519,"GPE":1953,"PRODUCT":1328,"FAC":1243,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":734,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"2":{"DATE":4200,"ORG":3487,"PERSON":3042,"QUANTITY":2519,"GPE":1953,"PRODUCT":1328,"FAC":1243,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":734,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"3":{"DATE":4200,"ORG":3487,"PERSON":3042,"QUANTITY":2519,"GPE":1953,"PRODUCT":1328,"FAC":1243,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":734,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"4":{"DATE":4200,"ORG":3487,"PERSON":3042,"QUANTITY":2519,"GPE":1953,"PRODUCT":1328,"FAC":1243,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":734,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4,"":1},"5":{"":1}}�cfg��neg_key�
 
1
+ ��moves��{"0":{},"1":{"DATE":4200,"ORG":3488,"PERSON":3043,"QUANTITY":2521,"GPE":1953,"PRODUCT":1328,"FAC":1244,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":735,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"2":{"DATE":4200,"ORG":3488,"PERSON":3043,"QUANTITY":2521,"GPE":1953,"PRODUCT":1328,"FAC":1244,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":735,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"3":{"DATE":4200,"ORG":3488,"PERSON":3043,"QUANTITY":2521,"GPE":1953,"PRODUCT":1328,"FAC":1244,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":735,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4},"4":{"DATE":4200,"ORG":3488,"PERSON":3043,"QUANTITY":2521,"GPE":1953,"PRODUCT":1328,"FAC":1244,"ORDINAL":1114,"WORK_OF_ART":1053,"EVENT":869,"NORP":735,"LOC":563,"MONEY":400,"TITLE_AFFIX":344,"TIME":300,"PERCENT":274,"MOVEMENT":148,"LAW":94,"LANGUAGE":82,"CARDINAL":27,"PET_NAME":20,"PHONE":4,"":1},"5":{"":1}}�cfg��neg_key�
parser/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0789362978a8d034979025d4b3a086def9e3f44845be928da0d143d457613208
+ oid sha256:4a834891908c8a11925ec2e94ca6facfafc6f51a681c351de5a44a9c2026e5af
  size 299888
parser/moves CHANGED
@@ -1 +1 @@
1
- ��moves�~{"0":{"":77992},"1":{"":83293},"2":{"compound":23506,"nmod":11446,"obl":11030,"nsubj":6884,"advcl":6063,"acl":6020,"obj":4629,"nummod":2487,"advmod":1922,"punct":1321,"det":830,"cc":726,"amod":372,"ccomp":325,"dislocated":235,"csubj":133,"dep":0},"3":{"case":35913,"punct":15455,"aux":14940,"fixed":7391,"mark":6644,"cop":2100,"compound":598,"advcl":152,"dep":58},"4":{"ROOT":7050}}�cfg��neg_key�
 
1
+ ��moves�{"0":{"":77992},"1":{"":83431},"2":{"compound":23506,"nmod":11446,"obl":11030,"nsubj":6884,"advcl":6063,"acl":6020,"obj":4629,"nummod":2487,"advmod":1922,"punct":1321,"det":830,"cc":726,"amod":372,"ccomp":325,"dislocated":235,"csubj":133,"dep":0},"3":{"case":35913,"punct":15455,"aux":14940,"fixed":7391,"mark":6644,"cop":2100,"compound":598,"dep":196,"advcl":152},"4":{"ROOT":7050}}�cfg��neg_key�
senter/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:19c712710d460306df4e72866bd5221d72a27366f3dca83c5d66ec0c55409481
+ oid sha256:20165fdde644486787c8444e6a7827e6c2b3977d25dc9f26b96136e953bcff2e
  size 213263
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e69d2a9239d6d8c26d8033ff7e2a6d193369dfd5ffa319ef4f8c18a11537e7c9
+ oid sha256:6bc127f8ea7af91f3b35f536e0a565690a02b641115a04d3417d717d4db5d02b
  size 6235418
vocab/key2row CHANGED
Binary files a/vocab/key2row and b/vocab/key2row differ
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:44e5c002847a11120089fb9a481003dfd02492a2c8669ca9625cd0a09c8ab22d
- size 15613806
+ oid sha256:3a812f79413801d2858a16f71168326eb951a7d0c53a7bda8acde7a3af273859
+ size 15615308
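
Most of the model files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the object hash, and the size in bytes. A minimal Python sketch of reading such a pointer, assuming only the three fields visible in this diff; the function name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New-side pointer for vocab/strings.json, copied from this diff.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3a812f79413801d2858a16f71168326eb951a7d0c53a7bda8acde7a3af273859
size 15615308
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

Comparing the parsed `size` fields of the old and new pointers shows which files actually changed size (for example `vocab/strings.json`) and which only changed content (for example `ner/model`).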