Commit 1d33edc by osanseviero
1 parent: b7db18e

Update spaCy pipeline

README.md CHANGED
@@ -14,47 +14,47 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.853733758
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8456530449
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8496741892
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9727831973
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
- value: 0.9049104721
38
  - name: SENTER Recall
39
  type: recall
40
- value: 0.8801372122
41
  - name: SENTER F Score
42
  type: f_score
43
- value: 0.8923519379
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.9186878782
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
- value: 0.9186878782
58
  ---
59
  ### Details: https://spacy.io/models/en#en_core_web_md
60
 
@@ -63,8 +63,8 @@ English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter,
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `en_core_web_md` |
66
- | **Version** | `3.1.0` |
67
- | **spaCy** | `>=3.1.0,<3.2.0` |
68
  | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 684830 keys, 20000 unique vectors (300 dimensions) |
@@ -72,16 +72,35 @@ English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter,
72
  | **License** | `MIT` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
75
  ### Accuracy
 
76
  | Type | Score |
77
  | --- | --- |
78
  | `TOKEN_ACC` | 99.93 |
79
- | `TAG_ACC` | 97.28 |
80
  | `DEP_UAS` | 91.87 |
81
- | `DEP_LAS` | 90.05 |
82
- | `ENTS_P` | 85.37 |
83
- | `ENTS_R` | 84.57 |
84
- | `ENTS_F` | 84.97 |
85
- | `SENTS_P` | 90.49 |
86
- | `SENTS_R` | 88.01 |
87
- | `SENTS_F` | 89.24 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.8531330602
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.8448016827
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8489469314
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
+ value: 0.9736958159
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
+ value: 0.9144345238
38
  - name: SENTER Recall
39
  type: recall
40
+ value: 0.8918134442
41
  - name: SENTER F Score
42
  type: f_score
43
+ value: 0.9029823331
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
+ value: 0.9186827918
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
+ value: 0.9186827918
58
  ---
59
  ### Details: https://spacy.io/models/en#en_core_web_md
60
 
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `en_core_web_md` |
66
+ | **Version** | `3.2.0` |
67
+ | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 684830 keys, 20000 unique vectors (300 dimensions) |
 
72
  | **License** | `MIT` |
73
  | **Author** | [Explosion](https://explosion.ai) |
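
The `spaCy` row above pins the pipeline to a version range. As a minimal sketch of what a range such as `>=3.2.0,<3.3.0` means, the stdlib-only check below compares plain `X.Y.Z` versions against it. The helper names `parse_version` and `satisfies` are hypothetical, and this is not how spaCy or pip validate versions: it ignores pre-releases and other PEP 440 forms.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, spec):
    """Check a version against a comma-separated spec like '>=3.2.0,<3.3.0'."""
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    installed = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators first so '>=' is not read as '>'.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not ops[symbol](installed, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


# Range taken from the updated README row; the versions checked are examples.
NEW_RANGE = ">=3.2.0,<3.3.0"
for candidate in ("3.1.0", "3.2.1", "3.3.0"):
    print(candidate, satisfies(candidate, NEW_RANGE))
```

For real dependency checks, a full PEP 440 implementation such as the `packaging` library is the safer choice.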
74
 
75
+ ### Label Scheme
76
+
77
+ <details>
78
+
79
+ <summary>View label scheme (114 labels for 4 components)</summary>
80
+
81
+ | Component | Labels |
82
+ | --- | --- |
83
+ | **`tagger`** | `$`, `''`, `,`, `-LRB-`, `-RRB-`, `.`, `:`, `ADD`, `AFX`, `CC`, `CD`, `DT`, `EX`, `FW`, `HYPH`, `IN`, `JJ`, `JJR`, `JJS`, `LS`, `MD`, `NFP`, `NN`, `NNP`, `NNPS`, `NNS`, `PDT`, `POS`, `PRP`, `PRP$`, `RB`, `RBR`, `RBS`, `RP`, `SYM`, `TO`, `UH`, `VB`, `VBD`, `VBG`, `VBN`, `VBP`, `VBZ`, `WDT`, `WP`, `WP$`, `WRB`, `XX`, ```` |
84
+ | **`parser`** | `ROOT`, `acl`, `acomp`, `advcl`, `advmod`, `agent`, `amod`, `appos`, `attr`, `aux`, `auxpass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `csubj`, `csubjpass`, `dative`, `dep`, `det`, `dobj`, `expl`, `intj`, `mark`, `meta`, `neg`, `nmod`, `npadvmod`, `nsubj`, `nsubjpass`, `nummod`, `oprd`, `parataxis`, `pcomp`, `pobj`, `poss`, `preconj`, `predet`, `prep`, `prt`, `punct`, `quantmod`, `relcl`, `xcomp` |
85
+ | **`senter`** | `I`, `S` |
86
+ | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
87
+
88
+ </details>
89
+
90
  ### Accuracy
91
+
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 99.93 |
95
+ | `TOKEN_P` | 99.57 |
96
+ | `TOKEN_R` | 99.58 |
97
+ | `TOKEN_F` | 99.57 |
98
+ | `TAG_ACC` | 97.37 |
99
+ | `SENTS_P` | 91.44 |
100
+ | `SENTS_R` | 89.18 |
101
+ | `SENTS_F` | 90.30 |
102
  | `DEP_UAS` | 91.87 |
103
+ | `DEP_LAS` | 90.07 |
104
+ | `ENTS_P` | 85.31 |
105
+ | `ENTS_R` | 84.48 |
106
+ | `ENTS_F` | 84.89 |
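
The `*_F` rows in this table are consistent with the usual F1 definition, the harmonic mean of precision and recall. The sketch below recomputes two of them from the unrounded values in this commit's model-index block. The function name `f_score` is an assumption for illustration, not spaCy's scorer API.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Unrounded values from the updated model-index metadata in this commit.
ner_f = f_score(0.8531330602, 0.8448016827)
senter_f = f_score(0.9144345238, 0.8918134442)

print(f"NER F:    {ner_f:.4f}")
print(f"SENTER F: {senter_f:.4f}")
```

The table shows the same scores multiplied by 100 and rounded to two decimals.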
 
 
 
accuracy.json CHANGED
@@ -1,327 +1,330 @@
1
  {
2
  "token_acc": 0.9993053983,
3
- "tag_acc": 0.9727831973,
4
- "dep_uas": 0.9186878782,
5
- "dep_las": 0.9005160534,
6
- "ents_p": 0.853733758,
7
- "ents_r": 0.8456530449,
8
- "ents_f": 0.8496741892,
9
- "sents_p": 0.9049104721,
10
- "sents_r": 0.8801372122,
11
- "sents_f": 0.8923519379,
12
- "speed": 9590.7931710533,
13
  "dep_las_per_type": {
14
  "prep": {
15
- "p": 0.8555038992,
16
- "r": 0.865793967,
17
- "f": 0.8606181756
18
  },
19
  "det": {
20
- "p": 0.9787355385,
21
- "r": 0.9796134714,
22
- "f": 0.9791743082
23
  },
24
  "pobj": {
25
- "p": 0.9604187748,
26
- "r": 0.9690555665,
27
- "f": 0.9647178405
28
  },
29
  "nsubj": {
30
- "p": 0.9590076472,
31
- "r": 0.9450164294,
32
- "f": 0.9519606329
33
  },
34
  "aux": {
35
- "p": 0.9815061794,
36
- "r": 0.9827294578,
37
- "f": 0.9821174377
38
  },
39
  "advmod": {
40
- "p": 0.8544032299,
41
- "r": 0.8546188794,
42
- "f": 0.854511041
43
  },
44
  "relcl": {
45
- "p": 0.762561925,
46
- "r": 0.7819303338,
47
- "f": 0.7721246865
48
  },
49
  "root": {
50
- "p": 0.9166100774,
51
- "r": 0.8904281285,
52
- "f": 0.9033294295
53
  },
54
  "xcomp": {
55
- "p": 0.8850493653,
56
- "r": 0.9009332376,
57
- "f": 0.8929206688
58
  },
59
  "amod": {
60
- "p": 0.9191229098,
61
- "r": 0.9151927438,
62
- "f": 0.9171536164
63
  },
64
  "compound": {
65
- "p": 0.9215718695,
66
- "r": 0.9312207619,
67
- "f": 0.9263711911
68
  },
69
  "poss": {
70
- "p": 0.976056338,
71
  "r": 0.9764492754,
72
- "f": 0.9762527672
73
  },
74
  "ccomp": {
75
- "p": 0.7681402723,
76
- "r": 0.8386965377,
77
- "f": 0.8018693409
78
  },
79
  "attr": {
80
- "p": 0.9009700889,
81
- "r": 0.9373423045,
82
- "f": 0.9187963726
83
  },
84
  "case": {
85
- "p": 0.9787654321,
86
  "r": 0.991991992,
87
- "f": 0.9853343276
88
  },
89
  "mark": {
90
- "p": 0.9043708609,
91
- "r": 0.9046104928,
92
- "f": 0.904490661
93
  },
94
  "intj": {
95
- "p": 0.6716891356,
96
- "r": 0.6205128205,
97
- "f": 0.6450875857
98
  },
99
  "advcl": {
100
- "p": 0.668953252,
101
- "r": 0.6630571644,
102
- "f": 0.6659921588
103
  },
104
  "cc": {
105
- "p": 0.8354582632,
106
- "r": 0.8307618706,
107
- "f": 0.8331034483
108
  },
109
  "neg": {
110
- "p": 0.9481296758,
111
- "r": 0.9538384345,
112
- "f": 0.9509754877
113
  },
114
  "conj": {
115
- "p": 0.763488544,
116
- "r": 0.7802114804,
117
- "f": 0.7717594322
118
  },
119
  "nsubjpass": {
120
- "p": 0.923991727,
121
- "r": 0.9164102564,
122
- "f": 0.9201853759
123
  },
124
  "auxpass": {
125
- "p": 0.9489342806,
126
- "r": 0.9735763098,
127
- "f": 0.961097369
128
  },
129
  "dobj": {
130
- "p": 0.9222507588,
131
- "r": 0.9442983505,
132
- "f": 0.9331443421
133
  },
134
  "nummod": {
135
- "p": 0.9328073301,
136
- "r": 0.9255050505,
137
- "f": 0.9291418431
138
  },
139
  "npadvmod": {
140
- "p": 0.7844106464,
141
- "r": 0.7328596803,
142
- "f": 0.7577594123
143
  },
144
  "prt": {
145
- "p": 0.816072908,
146
- "r": 0.8826164875,
147
- "f": 0.8480413259
148
  },
149
  "pcomp": {
150
- "p": 0.8836720392,
151
- "r": 0.8830532213,
152
- "f": 0.8833625219
153
  },
154
  "expl": {
155
- "p": 0.9809322034,
156
  "r": 0.9914346895,
157
- "f": 0.9861554846
158
  },
159
  "acl": {
160
- "p": 0.7393586006,
161
- "r": 0.6917621386,
162
- "f": 0.7147688839
163
  },
164
  "agent": {
165
- "p": 0.9043478261,
166
- "r": 0.9318996416,
167
- "f": 0.9179170344
168
  },
169
  "dative": {
170
- "p": 0.7763496144,
171
- "r": 0.6926605505,
172
- "f": 0.7321212121
173
  },
174
  "acomp": {
175
- "p": 0.9131627057,
176
- "r": 0.906122449,
177
- "f": 0.9096289552
178
  },
179
  "dep": {
180
- "p": 0.3927125506,
181
- "r": 0.1574675325,
182
- "f": 0.224797219
183
  },
184
  "csubj": {
185
- "p": 0.6436781609,
186
- "r": 0.6627218935,
187
- "f": 0.6530612245
188
  },
189
  "quantmod": {
190
- "p": 0.8633093525,
191
- "r": 0.7798537774,
192
- "f": 0.8194622279
193
  },
194
  "nmod": {
195
- "p": 0.7863599014,
196
- "r": 0.5831809872,
197
- "f": 0.6696990903
198
  },
199
  "appos": {
200
- "p": 0.6757117438,
201
- "r": 0.6590021692,
202
- "f": 0.6672523611
203
  },
204
  "predet": {
205
- "p": 0.8582995951,
206
- "r": 0.9098712446,
207
- "f": 0.8833333333
208
  },
209
  "preconj": {
210
- "p": 0.5333333333,
211
- "r": 0.6511627907,
212
- "f": 0.5863874346
213
  },
214
  "oprd": {
215
- "p": 0.8266666667,
216
- "r": 0.7402985075,
217
- "f": 0.7811023622
218
  },
219
  "parataxis": {
220
- "p": 0.6164383562,
221
- "r": 0.4880694143,
222
- "f": 0.5447941889
223
  },
224
  "meta": {
225
- "p": 0.9047619048,
226
- "r": 0.3653846154,
227
- "f": 0.5205479452
228
  },
229
  "csubjpass": {
230
- "p": 0.625,
231
- "r": 0.8333333333,
232
- "f": 0.7142857143
233
  }
234
  },
 
 
 
235
  "ents_per_type": {
236
  "DATE": {
237
- "p": 0.8675308252,
238
- "r": 0.8711111111,
239
- "f": 0.8693172818
240
  },
241
  "GPE": {
242
- "p": 0.9202037351,
243
  "r": 0.9071129707,
244
- "f": 0.9136114623
245
  },
246
  "ORDINAL": {
247
- "p": 0.7936962751,
248
- "r": 0.8602484472,
249
- "f": 0.825633383
250
  },
251
  "ORG": {
252
- "p": 0.8129760967,
253
- "r": 0.8205196182,
254
- "f": 0.8167304394
255
  },
256
  "QUANTITY": {
257
- "p": 0.8082191781,
258
- "r": 0.6483516484,
259
- "f": 0.7195121951
260
- },
261
- "LOC": {
262
- "p": 0.6907216495,
263
- "r": 0.6401273885,
264
- "f": 0.6644628099
265
  },
266
  "CARDINAL": {
267
- "p": 0.8169897377,
268
- "r": 0.8519619501,
269
- "f": 0.8341094296
270
- },
271
- "PERSON": {
272
- "p": 0.8785310734,
273
- "r": 0.9135117493,
274
- "f": 0.89568
275
  },
276
  "NORP": {
277
- "p": 0.9040322581,
278
- "r": 0.8968,
279
- "f": 0.9004016064
280
  },
281
- "PRODUCT": {
282
- "p": 0.6276595745,
283
- "r": 0.2796208531,
284
- "f": 0.3868852459
285
  },
286
  "FAC": {
287
- "p": 0.4297520661,
288
- "r": 0.4,
289
- "f": 0.4143426295
290
- },
291
- "MONEY": {
292
- "p": 0.9168674699,
293
- "r": 0.8984651712,
294
- "f": 0.9075730471
295
  },
296
  "TIME": {
297
- "p": 0.7267267267,
298
- "r": 0.7076023392,
299
- "f": 0.717037037
300
  },
301
- "WORK_OF_ART": {
302
- "p": 0.4122137405,
303
- "r": 0.2783505155,
304
- "f": 0.3323076923
305
  },
306
  "EVENT": {
307
- "p": 0.6162790698,
308
- "r": 0.3045977011,
309
- "f": 0.4076923077
310
  },
311
  "LAW": {
312
- "p": 0.4655172414,
313
- "r": 0.421875,
314
- "f": 0.4426229508
315
  },
316
  "PERCENT": {
317
- "p": 0.9189189189,
318
- "r": 0.8851454824,
319
- "f": 0.9017160686
320
  },
321
  "LANGUAGE": {
322
- "p": 0.7407407407,
323
- "r": 0.625,
324
- "f": 0.6779661017
325
  }
326
- }
 
327
  }
 
1
  {
2
  "token_acc": 0.9993053983,
3
+ "token_p": 0.9956742163,
4
+ "token_r": 0.9957505887,
5
+ "token_f": 0.9957124011,
6
+ "tag_acc": 0.9736958159,
7
+ "sents_p": 0.9144345238,
8
+ "sents_r": 0.8918134442,
9
+ "sents_f": 0.9029823331,
10
+ "dep_uas": 0.9186827918,
11
+ "dep_las": 0.9006556195,
 
12
  "dep_las_per_type": {
13
  "prep": {
14
+ "p": 0.8569122175,
15
+ "r": 0.8659836843,
16
+ "f": 0.8614240691
17
  },
18
  "det": {
19
+ "p": 0.9770765472,
20
+ "r": 0.9784310528,
21
+ "f": 0.9777533309
22
  },
23
  "pobj": {
24
+ "p": 0.9611128429,
25
+ "r": 0.968623601,
26
+ "f": 0.9648536056
27
  },
28
  "nsubj": {
29
+ "p": 0.9594312375,
30
+ "r": 0.9459802848,
31
+ "f": 0.9526582837
32
  },
33
  "aux": {
34
+ "p": 0.9797621161,
35
+ "r": 0.9826404344,
36
+ "f": 0.9811991644
37
  },
38
  "advmod": {
39
+ "p": 0.8561672709,
40
+ "r": 0.8543664816,
41
+ "f": 0.8552659283
42
  },
43
  "relcl": {
44
+ "p": 0.765480427,
45
+ "r": 0.780478955,
46
+ "f": 0.772906935
47
  },
48
  "root": {
49
+ "p": 0.9166215118,
50
+ "r": 0.8927369879,
51
+ "f": 0.9045216055
52
  },
53
  "xcomp": {
54
+ "p": 0.8828097423,
55
+ "r": 0.8977027997,
56
+ "f": 0.8901939847
57
  },
58
  "amod": {
59
+ "p": 0.92090506,
60
+ "r": 0.9149983803,
61
+ "f": 0.9179422183
62
  },
63
  "compound": {
64
+ "p": 0.917950968,
65
+ "r": 0.9321118289,
66
+ "f": 0.924977203
67
  },
68
  "poss": {
69
+ "p": 0.9744877461,
70
  "r": 0.9764492754,
71
+ "f": 0.9754675246
72
  },
73
  "ccomp": {
74
+ "p": 0.7754030746,
75
+ "r": 0.8423625255,
76
+ "f": 0.8074970715
77
  },
78
  "attr": {
79
+ "p": 0.8974979822,
80
+ "r": 0.9352396972,
81
+ "f": 0.9159802306
82
  },
83
  "case": {
84
+ "p": 0.9811881188,
85
  "r": 0.991991992,
86
+ "f": 0.9865604778
87
  },
88
  "mark": {
89
+ "p": 0.9043686734,
90
+ "r": 0.8995760466,
91
+ "f": 0.9019659936
92
  },
93
  "intj": {
94
+ "p": 0.6650717703,
95
+ "r": 0.610989011,
96
+ "f": 0.636884307
97
  },
98
  "advcl": {
99
+ "p": 0.6723033564,
100
+ "r": 0.6607907328,
101
+ "f": 0.666497333
102
  },
103
  "cc": {
104
+ "p": 0.835978836,
105
+ "r": 0.8314794881,
106
+ "f": 0.8337230917
107
  },
108
  "neg": {
109
+ "p": 0.9431988042,
110
+ "r": 0.9498243853,
111
+ "f": 0.9465
112
  },
113
  "conj": {
114
+ "p": 0.7615497433,
115
+ "r": 0.7843655589,
116
+ "f": 0.7727892844
117
  },
118
  "nsubjpass": {
119
+ "p": 0.9269311065,
120
+ "r": 0.9107692308,
121
+ "f": 0.9187790998
122
  },
123
  "auxpass": {
124
+ "p": 0.9508050089,
125
+ "r": 0.9685649203,
126
+ "f": 0.9596027985
127
  },
128
  "dobj": {
129
+ "p": 0.9220839813,
130
+ "r": 0.9449358515,
131
+ "f": 0.9333700657
132
  },
133
  "nummod": {
134
+ "p": 0.9399338254,
135
+ "r": 0.9325757576,
136
+ "f": 0.9362403346
137
  },
138
  "npadvmod": {
139
+ "p": 0.7793445122,
140
+ "r": 0.7264653641,
141
+ "f": 0.7519764663
142
  },
143
  "prt": {
144
+ "p": 0.8145094806,
145
+ "r": 0.8853046595,
146
+ "f": 0.8484328038
147
  },
148
  "pcomp": {
149
+ "p": 0.8889679715,
150
+ "r": 0.8746498599,
151
+ "f": 0.8817507942
152
  },
153
  "expl": {
154
+ "p": 0.983014862,
155
  "r": 0.9914346895,
156
+ "f": 0.987206823
157
  },
158
  "acl": {
159
+ "p": 0.7449741528,
160
+ "r": 0.7075831969,
161
+ "f": 0.7257974259
162
  },
163
  "agent": {
164
+ "p": 0.8957264957,
165
+ "r": 0.9390681004,
166
+ "f": 0.9168853893
167
  },
168
  "dative": {
169
+ "p": 0.7732997481,
170
+ "r": 0.7041284404,
171
+ "f": 0.7370948379
172
  },
173
  "acomp": {
174
+ "p": 0.9094236048,
175
+ "r": 0.9015873016,
176
+ "f": 0.9054884992
177
  },
178
  "dep": {
179
+ "p": 0.3909465021,
180
+ "r": 0.1542207792,
181
+ "f": 0.2211874272
182
  },
183
  "csubj": {
184
+ "p": 0.8098591549,
185
+ "r": 0.6804733728,
186
+ "f": 0.7395498392
187
  },
188
  "quantmod": {
189
+ "p": 0.8739800544,
190
+ "r": 0.7831031682,
191
+ "f": 0.8260497001
192
  },
193
  "nmod": {
194
+ "p": 0.7614457831,
195
+ "r": 0.5776965265,
196
+ "f": 0.656964657
197
  },
198
  "appos": {
199
+ "p": 0.6850678733,
200
+ "r": 0.6568329718,
201
+ "f": 0.6706533776
202
  },
203
  "predet": {
204
+ "p": 0.8467741935,
205
+ "r": 0.9012875536,
206
+ "f": 0.8731808732
207
  },
208
  "preconj": {
209
+ "p": 0.5454545455,
210
+ "r": 0.6279069767,
211
+ "f": 0.5837837838
212
  },
213
  "oprd": {
214
+ "p": 0.8413793103,
215
+ "r": 0.728358209,
216
+ "f": 0.7808
217
  },
218
  "parataxis": {
219
+ "p": 0.6129943503,
220
+ "r": 0.4707158351,
221
+ "f": 0.5325153374
222
  },
223
  "meta": {
224
+ "p": 0.8,
225
+ "r": 0.3076923077,
226
+ "f": 0.4444444444
227
  },
228
  "csubjpass": {
229
+ "p": 0.5714285714,
230
+ "r": 0.6666666667,
231
+ "f": 0.6153846154
232
  }
233
  },
234
+ "ents_p": 0.8531330602,
235
+ "ents_r": 0.8448016827,
236
+ "ents_f": 0.8489469314,
237
  "ents_per_type": {
238
  "DATE": {
239
+ "p": 0.8645998102,
240
+ "r": 0.8676190476,
241
+ "f": 0.8661067977
242
  },
243
  "GPE": {
244
+ "p": 0.9183846371,
245
  "r": 0.9071129707,
246
+ "f": 0.9127140051
247
  },
248
  "ORDINAL": {
249
+ "p": 0.7765363128,
250
+ "r": 0.8633540373,
251
+ "f": 0.8176470588
252
+ },
253
+ "PERSON": {
254
+ "p": 0.8805737449,
255
+ "r": 0.9216710183,
256
+ "f": 0.9006538032
257
  },
258
  "ORG": {
259
+ "p": 0.8025329543,
260
+ "r": 0.8231707317,
261
+ "f": 0.8127208481
262
  },
263
  "QUANTITY": {
264
+ "p": 0.7697841727,
265
+ "r": 0.5879120879,
266
+ "f": 0.6666666667
267
  },
268
  "CARDINAL": {
269
+ "p": 0.8279202279,
270
+ "r": 0.8638525565,
271
+ "f": 0.8455048007
272
  },
273
  "NORP": {
274
+ "p": 0.9102667745,
275
+ "r": 0.9008,
276
+ "f": 0.905508645
277
  },
278
+ "LOC": {
279
+ "p": 0.7022058824,
280
+ "r": 0.6082802548,
281
+ "f": 0.6518771331
282
  },
283
  "FAC": {
284
+ "p": 0.4122807018,
285
+ "r": 0.3615384615,
286
+ "f": 0.3852459016
287
  },
288
  "TIME": {
289
+ "p": 0.7450980392,
290
+ "r": 0.6666666667,
291
+ "f": 0.7037037037
292
  },
293
+ "PRODUCT": {
294
+ "p": 0.6376811594,
295
+ "r": 0.2085308057,
296
+ "f": 0.3142857143
297
+ },
298
+ "MONEY": {
299
+ "p": 0.9027611044,
300
+ "r": 0.8878394333,
301
+ "f": 0.8952380952
302
  },
303
  "EVENT": {
304
+ "p": 0.6043956044,
305
+ "r": 0.316091954,
306
+ "f": 0.4150943396
307
+ },
308
+ "WORK_OF_ART": {
309
+ "p": 0.5317460317,
310
+ "r": 0.3453608247,
311
+ "f": 0.41875
312
  },
313
  "LAW": {
314
+ "p": 0.4666666667,
315
+ "r": 0.328125,
316
+ "f": 0.3853211009
317
  },
318
  "PERCENT": {
319
+ "p": 0.9090909091,
320
+ "r": 0.8728943338,
321
+ "f": 0.890625
322
  },
323
  "LANGUAGE": {
324
+ "p": 0.6956521739,
325
+ "r": 0.5,
326
+ "f": 0.5818181818
327
  }
328
+ },
329
+ "speed": 7620.1455610511
330
  }
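
To read the headline changes in `accuracy.json` at a glance, the sketch below subtracts the old top-level scores from the new ones. The numbers are copied from the removed and added lines above; the dictionary names `old`, `new`, and `deltas` are illustrative only, and `speed` is treated as a throughput figure whose unit the file does not state.

```python
# Top-level scores copied from the removed (old) and added (new) lines.
old = {
    "tag_acc": 0.9727831973,
    "dep_uas": 0.9186878782,
    "dep_las": 0.9005160534,
    "ents_f": 0.8496741892,
    "sents_f": 0.8923519379,
    "speed": 9590.7931710533,
}
new = {
    "tag_acc": 0.9736958159,
    "dep_uas": 0.9186827918,
    "dep_las": 0.9006556195,
    "ents_f": 0.8489469314,
    "sents_f": 0.9029823331,
    "speed": 7620.1455610511,
}

# Absolute change per metric (new minus old).
deltas = {name: new[name] - old[name] for name in old}

# Relative change in the reported speed figure.
speed_change = new["speed"] / old["speed"] - 1

for name, delta in deltas.items():
    print(f"{name:8s} {old[name]:>14.6f} -> {new[name]:>14.6f}  ({delta:+.6f})")
print(f"speed relative change: {speed_change:+.1%}")
```

A positive delta means the 3.2.0 pipeline scores higher on that metric than 3.1.0 did on the same evaluation.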
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -1,10 +1,8 @@
 [paths]
-train = "corpus/en-core-web/train.spacy"
-dev = "corpus/en-core-web/dev.spacy"
-vectors = "corpus/en_vectors"
-raw = null
+train = null
+dev = null
+vectors = null
 init_tok2vec = null
-vocab_data = null
 
 [system]
 gpu_allocator = null
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
 
 [components.attribute_ruler]
 factory = "attribute_ruler"
+scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
 validate = false
 
 [components.lemmatizer]
@@ -31,11 +30,13 @@ factory = "lemmatizer"
 mode = "rule"
 model = null
 overwrite = false
+scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
 
 [components.ner]
 factory = "ner"
 incorrect_spans_key = null
 moves = null
+scorer = {"@scorers":"spacy.ner_scorer.v1"}
 update_with_oracle_cut_size = 100
 
 [components.ner.model]
@@ -53,8 +54,8 @@ nO = null
 [components.ner.model.tok2vec.embed]
 @architectures = "spacy.MultiHashEmbed.v2"
 width = 96
-attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
-rows = [5000,2500,2500,2500]
+attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
+rows = [5000,2500,2500,2500,100]
 include_static_vectors = true
 
 [components.ner.model.tok2vec.encode]
@@ -69,6 +70,7 @@ factory = "parser"
 learn_tokens = false
 min_action_freq = 30
 moves = null
+scorer = {"@scorers":"spacy.parser_scorer.v1"}
 update_with_oracle_cut_size = 100
 
 [components.parser.model]
@@ -87,6 +89,8 @@ upstream = "tok2vec"
 
 [components.senter]
 factory = "senter"
+overwrite = false
+scorer = {"@scorers":"spacy.senter_scorer.v1"}
 
 [components.senter.model]
 @architectures = "spacy.Tagger.v1"
@@ -98,8 +102,8 @@ nO = null
 [components.senter.model.tok2vec.embed]
 @architectures = "spacy.MultiHashEmbed.v2"
 width = 16
-attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
-rows = [1000,500,500,500]
+attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
+rows = [1000,500,500,500,50]
 include_static_vectors = true
 
 [components.senter.model.tok2vec.encode]
@@ -111,6 +115,8 @@ maxout_pieces = 2
 
 [components.tagger]
 factory = "tagger"
+overwrite = false
+scorer = {"@scorers":"spacy.tagger_scorer.v1"}
 
 [components.tagger.model]
 @architectures = "spacy.Tagger.v1"
@@ -130,8 +136,8 @@ factory = "tok2vec"
 [components.tok2vec.model.embed]
 @architectures = "spacy.MultiHashEmbed.v2"
 width = ${components.tok2vec.model.encode:width}
-attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
-rows = [5000,2500,2500,2500]
+attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
+rows = [5000,2500,2500,2500,100]
 include_static_vectors = true
 
 [components.tok2vec.model.encode]
@@ -145,27 +151,19 @@ maxout_pieces = 3
 
 [corpora.dev]
 @readers = "spacy.Corpus.v1"
-limit = 0
-max_length = 0
-path = ${paths:dev}
+path = ${paths.dev}
 gold_preproc = false
+max_length = 0
+limit = 0
 augmenter = null
 
 [corpora.train]
 @readers = "spacy.Corpus.v1"
-path = ${paths:train}
-max_length = 5000
+path = ${paths.train}
 gold_preproc = false
+max_length = 0
 limit = 0
-
-[corpora.train.augmenter]
-@augmenters = "spacy.orth_variants.v1"
-level = 0.2
-lower = 0.5
-
-[corpora.train.augmenter.orth_variants]
-@readers = "srsly.read_json.v1"
-path = "assets/orth_variants.json"
+augmenter = null
 
 [training]
 train_corpus = "corpora.train"
@@ -196,9 +194,8 @@ compound = 1.001
 t = 0.0
 
 [training.logger]
-@loggers = "spacy.WandbLogger.v1"
-project_name = "spacy-v3.0.0a2"
-remove_config_values = []
+@loggers = "spacy.ConsoleLogger.v1"
+progress_bar = false
 
 [training.optimizer]
 @optimizers = "Adam.v1"
@@ -219,16 +216,17 @@ dep_las_per_type = null
 sents_p = null
 sents_r = null
 sents_f = 0.02
-lemma_acc = 0.33
-ents_f = 0.33
+lemma_acc = 0.5
+ents_f = 0.16
 ents_p = 0.0
 ents_r = 0.0
 ents_per_type = null
+speed = 0.0
 
 [pretraining]
 
 [initialize]
-vocab_data = ${paths.vocab_data}
+vocab_data = null
 vectors = ${paths.vectors}
 init_tok2vec = ${paths.init_tok2vec}
 before_init = null
en_core_web_md-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:45971f672a6312d493c0703dc4deeddde6e4f81ba98b626c8abf67b4bef720c7
-size 45386139
+oid sha256:9728ab892904819567e70720a3887b59ded8721f14cc27b3d13273ad2b8ba458
+size 45684441
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"en",
3
  "name":"core_web_md",
4
- "version":"3.1.0",
5
  "description":"English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"MIT",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":300,
14
  "vectors":20000,
@@ -170,330 +170,333 @@
170
  ],
171
  "performance":{
172
  "token_acc":0.9993053983,
173
- "tag_acc":0.9727831973,
174
- "dep_uas":0.9186878782,
175
- "dep_las":0.9005160534,
176
- "ents_p":0.853733758,
177
- "ents_r":0.8456530449,
178
- "ents_f":0.8496741892,
179
- "sents_p":0.9049104721,
180
- "sents_r":0.8801372122,
181
- "sents_f":0.8923519379,
182
- "speed":9590.7931710533,
183
  "dep_las_per_type":{
184
  "prep":{
185
- "p":0.8555038992,
186
- "r":0.865793967,
187
- "f":0.8606181756
188
  },
189
  "det":{
190
- "p":0.9787355385,
191
- "r":0.9796134714,
192
- "f":0.9791743082
193
  },
194
  "pobj":{
195
- "p":0.9604187748,
196
- "r":0.9690555665,
197
- "f":0.9647178405
198
  },
199
  "nsubj":{
200
- "p":0.9590076472,
201
- "r":0.9450164294,
202
- "f":0.9519606329
203
  },
204
  "aux":{
205
- "p":0.9815061794,
206
- "r":0.9827294578,
207
- "f":0.9821174377
208
  },
209
  "advmod":{
210
- "p":0.8544032299,
211
- "r":0.8546188794,
212
- "f":0.854511041
213
  },
214
  "relcl":{
215
- "p":0.762561925,
216
- "r":0.7819303338,
217
- "f":0.7721246865
218
  },
219
  "root":{
220
- "p":0.9166100774,
221
- "r":0.8904281285,
222
- "f":0.9033294295
223
  },
224
  "xcomp":{
225
- "p":0.8850493653,
226
- "r":0.9009332376,
227
- "f":0.8929206688
228
  },
229
  "amod":{
230
- "p":0.9191229098,
231
- "r":0.9151927438,
232
- "f":0.9171536164
233
  },
234
  "compound":{
235
- "p":0.9215718695,
236
- "r":0.9312207619,
237
- "f":0.9263711911
238
  },
239
  "poss":{
240
- "p":0.976056338,
241
  "r":0.9764492754,
242
- "f":0.9762527672
243
  },
244
  "ccomp":{
245
- "p":0.7681402723,
246
- "r":0.8386965377,
247
- "f":0.8018693409
248
  },
249
  "attr":{
250
- "p":0.9009700889,
251
- "r":0.9373423045,
252
- "f":0.9187963726
253
  },
254
  "case":{
255
- "p":0.9787654321,
256
  "r":0.991991992,
257
- "f":0.9853343276
258
  },
259
  "mark":{
260
- "p":0.9043708609,
261
- "r":0.9046104928,
262
- "f":0.904490661
263
  },
264
  "intj":{
265
- "p":0.6716891356,
266
- "r":0.6205128205,
267
- "f":0.6450875857
268
  },
269
  "advcl":{
270
- "p":0.668953252,
271
- "r":0.6630571644,
272
- "f":0.6659921588
273
  },
274
  "cc":{
275
- "p":0.8354582632,
276
- "r":0.8307618706,
277
- "f":0.8331034483
278
  },
279
  "neg":{
280
- "p":0.9481296758,
281
- "r":0.9538384345,
282
- "f":0.9509754877
283
  },
284
  "conj":{
285
- "p":0.763488544,
286
- "r":0.7802114804,
287
- "f":0.7717594322
288
  },
289
  "nsubjpass":{
290
- "p":0.923991727,
291
- "r":0.9164102564,
292
- "f":0.9201853759
293
  },
294
  "auxpass":{
295
- "p":0.9489342806,
296
- "r":0.9735763098,
297
- "f":0.961097369
298
  },
299
  "dobj":{
300
- "p":0.9222507588,
301
- "r":0.9442983505,
302
- "f":0.9331443421
303
  },
304
  "nummod":{
305
- "p":0.9328073301,
306
- "r":0.9255050505,
307
- "f":0.9291418431
308
  },
309
  "npadvmod":{
310
- "p":0.7844106464,
311
- "r":0.7328596803,
312
- "f":0.7577594123
313
  },
314
  "prt":{
315
- "p":0.816072908,
316
- "r":0.8826164875,
317
- "f":0.8480413259
318
  },
319
  "pcomp":{
320
- "p":0.8836720392,
321
- "r":0.8830532213,
322
- "f":0.8833625219
323
  },
324
  "expl":{
325
- "p":0.9809322034,
326
  "r":0.9914346895,
327
- "f":0.9861554846
328
  },
329
  "acl":{
330
- "p":0.7393586006,
331
- "r":0.6917621386,
332
- "f":0.7147688839
333
  },
334
  "agent":{
335
- "p":0.9043478261,
336
- "r":0.9318996416,
337
- "f":0.9179170344
338
  },
339
  "dative":{
340
- "p":0.7763496144,
341
- "r":0.6926605505,
342
- "f":0.7321212121
343
  },
344
  "acomp":{
345
- "p":0.9131627057,
346
- "r":0.906122449,
347
- "f":0.9096289552
348
  },
349
  "dep":{
350
- "p":0.3927125506,
351
- "r":0.1574675325,
352
- "f":0.224797219
353
  },
354
  "csubj":{
355
- "p":0.6436781609,
356
- "r":0.6627218935,
357
- "f":0.6530612245
358
  },
359
  "quantmod":{
360
- "p":0.8633093525,
361
- "r":0.7798537774,
362
- "f":0.8194622279
363
  },
364
  "nmod":{
365
- "p":0.7863599014,
366
- "r":0.5831809872,
367
- "f":0.6696990903
368
  },
369
  "appos":{
370
- "p":0.6757117438,
371
- "r":0.6590021692,
372
- "f":0.6672523611
373
  },
374
  "predet":{
375
- "p":0.8582995951,
376
- "r":0.9098712446,
377
- "f":0.8833333333
378
  },
379
  "preconj":{
380
- "p":0.5333333333,
381
- "r":0.6511627907,
382
- "f":0.5863874346
383
  },
384
  "oprd":{
385
- "p":0.8266666667,
386
- "r":0.7402985075,
387
- "f":0.7811023622
388
  },
389
  "parataxis":{
390
- "p":0.6164383562,
391
- "r":0.4880694143,
392
- "f":0.5447941889
393
  },
394
  "meta":{
395
- "p":0.9047619048,
396
- "r":0.3653846154,
397
- "f":0.5205479452
398
  },
399
  "csubjpass":{
400
- "p":0.625,
401
- "r":0.8333333333,
402
- "f":0.7142857143
403
  }
404
  },
 
 
 
405
  "ents_per_type":{
406
  "DATE":{
407
- "p":0.8675308252,
408
- "r":0.8711111111,
409
- "f":0.8693172818
410
  },
411
  "GPE":{
412
- "p":0.9202037351,
413
  "r":0.9071129707,
414
- "f":0.9136114623
415
  },
416
  "ORDINAL":{
417
- "p":0.7936962751,
418
- "r":0.8602484472,
419
- "f":0.825633383
420
  },
421
  "ORG":{
422
- "p":0.8129760967,
423
- "r":0.8205196182,
424
- "f":0.8167304394
425
  },
426
  "QUANTITY":{
427
- "p":0.8082191781,
428
- "r":0.6483516484,
429
- "f":0.7195121951
430
- },
431
- "LOC":{
432
- "p":0.6907216495,
433
- "r":0.6401273885,
434
- "f":0.6644628099
435
  },
436
  "CARDINAL":{
437
- "p":0.8169897377,
438
- "r":0.8519619501,
439
- "f":0.8341094296
440
- },
441
- "PERSON":{
442
- "p":0.8785310734,
443
- "r":0.9135117493,
444
- "f":0.89568
445
  },
446
  "NORP":{
447
- "p":0.9040322581,
448
- "r":0.8968,
449
- "f":0.9004016064
450
  },
451
- "PRODUCT":{
452
- "p":0.6276595745,
453
- "r":0.2796208531,
454
- "f":0.3868852459
455
  },
456
  "FAC":{
457
- "p":0.4297520661,
458
- "r":0.4,
459
- "f":0.4143426295
460
- },
461
- "MONEY":{
462
- "p":0.9168674699,
463
- "r":0.8984651712,
464
- "f":0.9075730471
465
  },
466
  "TIME":{
467
- "p":0.7267267267,
468
- "r":0.7076023392,
469
- "f":0.717037037
470
  },
471
- "WORK_OF_ART":{
472
- "p":0.4122137405,
473
- "r":0.2783505155,
474
- "f":0.3323076923
475
  },
476
  "EVENT":{
477
- "p":0.6162790698,
478
- "r":0.3045977011,
479
- "f":0.4076923077
480
  },
481
  "LAW":{
482
- "p":0.4655172414,
483
- "r":0.421875,
484
- "f":0.4426229508
485
  },
486
  "PERCENT":{
487
- "p":0.9189189189,
488
- "r":0.8851454824,
489
- "f":0.9017160686
490
  },
491
  "LANGUAGE":{
492
- "p":0.7407407407,
493
- "r":0.625,
494
- "f":0.6779661017
495
  }
496
- }
497
  },
498
  "sources":[
499
  {
 
1
  {
2
  "lang":"en",
3
  "name":"core_web_md",
4
+ "version":"3.2.0",
5
  "description":"English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"MIT",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
  "vectors":20000,
 
170
  ],
171
  "performance":{
172
  "token_acc":0.9993053983,
173
+ "token_p":0.9956742163,
174
+ "token_r":0.9957505887,
175
+ "token_f":0.9957124011,
176
+ "tag_acc":0.9736958159,
177
+ "sents_p":0.9144345238,
178
+ "sents_r":0.8918134442,
179
+ "sents_f":0.9029823331,
180
+ "dep_uas":0.9186827918,
181
+ "dep_las":0.9006556195,
182
  "dep_las_per_type":{
183
  "prep":{
184
+ "p":0.8569122175,
185
+ "r":0.8659836843,
186
+ "f":0.8614240691
187
  },
188
  "det":{
189
+ "p":0.9770765472,
190
+ "r":0.9784310528,
191
+ "f":0.9777533309
192
  },
193
  "pobj":{
194
+ "p":0.9611128429,
195
+ "r":0.968623601,
196
+ "f":0.9648536056
197
  },
198
  "nsubj":{
199
+ "p":0.9594312375,
200
+ "r":0.9459802848,
201
+ "f":0.9526582837
202
  },
203
  "aux":{
204
+ "p":0.9797621161,
205
+ "r":0.9826404344,
206
+ "f":0.9811991644
207
  },
208
  "advmod":{
209
+ "p":0.8561672709,
210
+ "r":0.8543664816,
211
+ "f":0.8552659283
212
  },
213
  "relcl":{
214
+ "p":0.765480427,
215
+ "r":0.780478955,
216
+ "f":0.772906935
217
  },
218
  "root":{
219
+ "p":0.9166215118,
220
+ "r":0.8927369879,
221
+ "f":0.9045216055
222
  },
223
  "xcomp":{
224
+ "p":0.8828097423,
225
+ "r":0.8977027997,
226
+ "f":0.8901939847
227
  },
228
  "amod":{
229
+ "p":0.92090506,
230
+ "r":0.9149983803,
231
+ "f":0.9179422183
232
  },
233
  "compound":{
234
+ "p":0.917950968,
235
+ "r":0.9321118289,
236
+ "f":0.924977203
237
  },
238
  "poss":{
239
+ "p":0.9744877461,
240
  "r":0.9764492754,
241
+ "f":0.9754675246
242
  },
243
  "ccomp":{
244
+ "p":0.7754030746,
245
+ "r":0.8423625255,
246
+ "f":0.8074970715
247
  },
248
  "attr":{
249
+ "p":0.8974979822,
250
+ "r":0.9352396972,
251
+ "f":0.9159802306
252
  },
253
  "case":{
254
+ "p":0.9811881188,
255
  "r":0.991991992,
256
+ "f":0.9865604778
257
  },
258
  "mark":{
259
+ "p":0.9043686734,
260
+ "r":0.8995760466,
261
+ "f":0.9019659936
262
  },
263
  "intj":{
264
+ "p":0.6650717703,
265
+ "r":0.610989011,
266
+ "f":0.636884307
267
  },
268
  "advcl":{
269
+ "p":0.6723033564,
270
+ "r":0.6607907328,
271
+ "f":0.666497333
272
  },
273
  "cc":{
274
+ "p":0.835978836,
275
+ "r":0.8314794881,
276
+ "f":0.8337230917
277
  },
278
  "neg":{
279
+ "p":0.9431988042,
280
+ "r":0.9498243853,
281
+ "f":0.9465
282
  },
283
  "conj":{
284
+ "p":0.7615497433,
285
+ "r":0.7843655589,
286
+ "f":0.7727892844
287
  },
288
  "nsubjpass":{
289
+ "p":0.9269311065,
290
+ "r":0.9107692308,
291
+ "f":0.9187790998
292
  },
293
  "auxpass":{
294
+ "p":0.9508050089,
295
+ "r":0.9685649203,
296
+ "f":0.9596027985
297
  },
298
  "dobj":{
299
+ "p":0.9220839813,
300
+ "r":0.9449358515,
301
+ "f":0.9333700657
302
  },
303
  "nummod":{
304
+ "p":0.9399338254,
305
+ "r":0.9325757576,
306
+ "f":0.9362403346
307
  },
308
  "npadvmod":{
309
+ "p":0.7793445122,
310
+ "r":0.7264653641,
311
+ "f":0.7519764663
312
  },
313
  "prt":{
314
+ "p":0.8145094806,
315
+ "r":0.8853046595,
316
+ "f":0.8484328038
317
  },
318
  "pcomp":{
319
+ "p":0.8889679715,
320
+ "r":0.8746498599,
321
+ "f":0.8817507942
322
  },
323
  "expl":{
324
+ "p":0.983014862,
325
  "r":0.9914346895,
326
+ "f":0.987206823
327
  },
328
  "acl":{
329
+ "p":0.7449741528,
330
+ "r":0.7075831969,
331
+ "f":0.7257974259
332
  },
333
  "agent":{
334
+ "p":0.8957264957,
335
+ "r":0.9390681004,
336
+ "f":0.9168853893
337
  },
338
  "dative":{
339
+ "p":0.7732997481,
340
+ "r":0.7041284404,
341
+ "f":0.7370948379
342
  },
343
  "acomp":{
344
+ "p":0.9094236048,
345
+ "r":0.9015873016,
346
+ "f":0.9054884992
347
  },
348
  "dep":{
349
+ "p":0.3909465021,
350
+ "r":0.1542207792,
351
+ "f":0.2211874272
352
  },
353
  "csubj":{
354
+ "p":0.8098591549,
355
+ "r":0.6804733728,
356
+ "f":0.7395498392
357
  },
358
  "quantmod":{
359
+ "p":0.8739800544,
360
+ "r":0.7831031682,
361
+ "f":0.8260497001
362
  },
363
  "nmod":{
364
+ "p":0.7614457831,
365
+ "r":0.5776965265,
366
+ "f":0.656964657
367
  },
368
  "appos":{
369
+ "p":0.6850678733,
370
+ "r":0.6568329718,
371
+ "f":0.6706533776
372
  },
373
  "predet":{
374
+ "p":0.8467741935,
375
+ "r":0.9012875536,
376
+ "f":0.8731808732
377
  },
378
  "preconj":{
379
+ "p":0.5454545455,
380
+ "r":0.6279069767,
381
+ "f":0.5837837838
382
  },
383
  "oprd":{
384
+ "p":0.8413793103,
385
+ "r":0.728358209,
386
+ "f":0.7808
387
  },
388
  "parataxis":{
389
+ "p":0.6129943503,
390
+ "r":0.4707158351,
391
+ "f":0.5325153374
392
  },
393
  "meta":{
394
+ "p":0.8,
395
+ "r":0.3076923077,
396
+ "f":0.4444444444
397
  },
398
  "csubjpass":{
399
+ "p":0.5714285714,
400
+ "r":0.6666666667,
401
+ "f":0.6153846154
402
  }
403
  },
404
+ "ents_p":0.8531330602,
405
+ "ents_r":0.8448016827,
406
+ "ents_f":0.8489469314,
407
  "ents_per_type":{
408
  "DATE":{
409
+ "p":0.8645998102,
410
+ "r":0.8676190476,
411
+ "f":0.8661067977
412
  },
413
  "GPE":{
414
+ "p":0.9183846371,
415
  "r":0.9071129707,
416
+ "f":0.9127140051
417
  },
418
  "ORDINAL":{
419
+ "p":0.7765363128,
420
+ "r":0.8633540373,
421
+ "f":0.8176470588
422
+ },
423
+ "PERSON":{
424
+ "p":0.8805737449,
425
+ "r":0.9216710183,
426
+ "f":0.9006538032
427
  },
428
  "ORG":{
429
+ "p":0.8025329543,
430
+ "r":0.8231707317,
431
+ "f":0.8127208481
432
  },
433
  "QUANTITY":{
434
+ "p":0.7697841727,
435
+ "r":0.5879120879,
436
+ "f":0.6666666667
437
  },
438
  "CARDINAL":{
439
+ "p":0.8279202279,
440
+ "r":0.8638525565,
441
+ "f":0.8455048007
442
  },
443
  "NORP":{
444
+ "p":0.9102667745,
445
+ "r":0.9008,
446
+ "f":0.905508645
447
  },
448
+ "LOC":{
449
+ "p":0.7022058824,
450
+ "r":0.6082802548,
451
+ "f":0.6518771331
452
  },
453
  "FAC":{
454
+ "p":0.4122807018,
455
+ "r":0.3615384615,
456
+ "f":0.3852459016
457
  },
458
  "TIME":{
459
+ "p":0.7450980392,
460
+ "r":0.6666666667,
461
+ "f":0.7037037037
462
  },
463
+ "PRODUCT":{
464
+ "p":0.6376811594,
465
+ "r":0.2085308057,
466
+ "f":0.3142857143
467
+ },
468
+ "MONEY":{
469
+ "p":0.9027611044,
470
+ "r":0.8878394333,
471
+ "f":0.8952380952
472
  },
473
  "EVENT":{
474
+ "p":0.6043956044,
475
+ "r":0.316091954,
476
+ "f":0.4150943396
477
+ },
478
+ "WORK_OF_ART":{
479
+ "p":0.5317460317,
480
+ "r":0.3453608247,
481
+ "f":0.41875
482
  },
483
  "LAW":{
484
+ "p":0.4666666667,
485
+ "r":0.328125,
486
+ "f":0.3853211009
487
  },
488
  "PERCENT":{
489
+ "p":0.9090909091,
490
+ "r":0.8728943338,
491
+ "f":0.890625
492
  },
493
  "LANGUAGE":{
494
+ "p":0.6956521739,
495
+ "r":0.5,
496
+ "f":0.5818181818
497
  }
498
+ },
499
+ "speed":7620.1455610511
500
  },
501
  "sources":[
502
  {
ner/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ed516fb1bc5645b65a975d01d0dcdb8055606aa9400aa1fdaf6346262dea887f
3
- size 6956943
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6cd7f9b1abdedfc181c608ba07ca44304f51c3d09e95e15f383c7cdc0ae3d4af
3
+ size 7106353
parser/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ca184f93dab0aa09d42e770ff8f2e0f25bd3dcd61a3094005b76baee08a13ea8
3
  size 319909
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1af8456fd324806b01f1902e781ba0db2b15c0d7a6c00520d1bf8b45c90de0b0
3
  size 319909
senter/cfg CHANGED
@@ -1,3 +1,3 @@
1
  {
2
-
3
  }
 
1
  {
2
+ "overwrite":false
3
  }
senter/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5ab78efbdd20832176d3ca143e2dec9017e15b029111de3e5a0b225e285f7427
3
- size 213211
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b563f0023bade8fb3547672bfbaa14530e33bfd139c3b322d3c6983d63b91dac
3
+ size 219901
tagger/cfg CHANGED
@@ -49,5 +49,6 @@
49
  "WRB",
50
  "XX",
51
  "``"
52
- ]
53
  }
 
49
  "WRB",
50
  "XX",
51
  "``"
52
+ ],
53
+ "overwrite":false
54
  }
tagger/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b840670649c721fca2352795b6112880ca6d6b7c8c231e7ce1a857c1c68754aa
3
  size 19389
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:83fea6b25bf633d2bbc46f21938b9869ada0c6cd637b26e0a77fb526cbc778f0
3
  size 19389
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:16460efbfb7b3cab24041bec70e235ed8fb4f80c32b73606742960a948facfca
3
- size 6811418
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ecc6afa27b4c6945e9d785433a4bf3e39c6b132dd4e4ebc95c121a3c66108c5d
3
+ size 6960804
tokenizer CHANGED
The diff for this file is too large to render.
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:af36b391672464e41eba46c48bd91ffc9cc5587b67f2122d70ac68c68f1a56fb
3
- size 9622332
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:266477264a21daaabda7d6b1200598290e20aaf2c72ebaf6e2a671f282f5e2bc
3
+ size 9695169
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "mode":"default"
3
+ }