oroszgy committed on
Commit e58cae3 · 1 Parent(s): e921a77

Update spacy pipeline to 3.6.0

README.md CHANGED
@@ -14,55 +14,55 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.8557640751
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8417721519
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8487104493
24
  - task:
25
  name: TAG
26
  type: token-classification
27
  metrics:
28
  - name: TAG (XPOS) Accuracy
29
  type: accuracy
30
- value: 0.9633953778
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
- value: 0.964398507
38
  - task:
39
  name: MORPH
40
  type: token-classification
41
  metrics:
42
  - name: Morph (UFeats) Accuracy
43
  type: accuracy
44
- value: 0.9338692698
45
  - task:
46
  name: LEMMA
47
  type: token-classification
48
  metrics:
49
  - name: Lemma Accuracy
50
  type: accuracy
51
- value: 0.9724428284
52
  - task:
53
  name: UNLABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Unlabeled Attachment Score (UAS)
57
  type: f_score
58
- value: 0.7978436658
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
  - name: Labeled Attachment Score (LAS)
64
  type: f_score
65
- value: 0.7223314055
66
  - task:
67
  name: SENTS
68
  type: token-classification
@@ -76,12 +76,12 @@ Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morpholog
76
  | Feature | Description |
77
  | --- | --- |
78
  | **Name** | `hu_core_news_md` |
79
- | **Version** | `3.5.2` |
80
- | **spaCy** | `>=3.5.0,<3.6.0` |
81
  | **Default Pipeline** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `parser`, `ner` |
82
  | **Components** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `parser`, `ner` |
83
  | **Vectors** | -1 keys, 200000 unique vectors (100 dimensions) |
84
- | **Sources** | [UD Hungarian Szeged](https://universaldependencies.org/treebanks/hu_szeged/index.html) (Richárd Farkas, Katalin Simkó, Zsolt Szántó, Viktor Varga, Veronika Vincze (MTA-SZTE Research Group on Artificial Intelligence))<br />[NYTK-NerKor Corpus](https://github.com/nytud/NYTK-NerKor) (Eszter Simon, Noémi Vadász (Department of Language Technology and Applied Linguistics))<br />[hunNERwiki](http://hlt.sztaki.hu/resources/hunnerwiki.html) (Eszter Simon, Dávid Márk Nemeskey (HLT Group, Budapest University of Technology and Economics))<br />[Szeged NER Corpus](https://rgai.inf.u-szeged.hu/node/130) (György Szarvas, Richárd Farkas, László Felföldi, András Kocsor, János Csirik (MTA-SZTE Research Group on Artificial Intelligence)) |
85
  | **License** | `cc-by-sa-4.0` |
86
  | **Author** | [SzegedAI, MILAB](https://github.com/huspacy/huspacy) |
87
 
@@ -111,15 +111,15 @@ Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morpholog
111
  | `SENTS_P` | 98.21 |
112
  | `SENTS_R` | 97.55 |
113
  | `SENTS_F` | 97.88 |
114
- | `TAG_ACC` | 96.34 |
115
- | `POS_ACC` | 96.44 |
116
- | `MORPH_ACC` | 93.39 |
117
- | `MORPH_MICRO_P` | 96.76 |
118
- | `MORPH_MICRO_R` | 95.92 |
119
- | `MORPH_MICRO_F` | 96.34 |
120
- | `LEMMA_ACC` | 97.24 |
121
- | `DEP_UAS` | 79.78 |
122
- | `DEP_LAS` | 72.23 |
123
- | `ENTS_P` | 85.58 |
124
- | `ENTS_R` | 84.18 |
125
- | `ENTS_F` | 84.87 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.8479221927
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.8430028129
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8454553469
24
  - task:
25
  name: TAG
26
  type: token-classification
27
  metrics:
28
  - name: TAG (XPOS) Accuracy
29
  type: accuracy
30
+ value: 0.9640156953
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9655469423
38
  - task:
39
  name: MORPH
40
  type: token-classification
41
  metrics:
42
  - name: Morph (UFeats) Accuracy
43
  type: accuracy
44
+ value: 0.9339649727
45
  - task:
46
  name: LEMMA
47
  type: token-classification
48
  metrics:
49
  - name: Lemma Accuracy
50
  type: accuracy
51
+ value: 0.9730169362
52
  - task:
53
  name: UNLABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Unlabeled Attachment Score (UAS)
57
  type: f_score
58
+ value: 0.8103583867
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
  - name: Labeled Attachment Score (LAS)
64
  type: f_score
65
+ value: 0.743357861
66
  - task:
67
  name: SENTS
68
  type: token-classification
 
76
  | Feature | Description |
77
  | --- | --- |
78
  | **Name** | `hu_core_news_md` |
79
+ | **Version** | `3.6.0` |
80
+ | **spaCy** | `>=3.6.0,<3.7.0` |
81
  | **Default Pipeline** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `parser`, `ner` |
82
  | **Components** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `parser`, `ner` |
83
  | **Vectors** | -1 keys, 200000 unique vectors (100 dimensions) |
84
+ | **Sources** | [UD Hungarian Szeged](https://universaldependencies.org/treebanks/hu_szeged/index.html) (Richárd Farkas, Katalin Simkó, Zsolt Szántó, Viktor Varga, Veronika Vincze (MTA-SZTE Research Group on Artificial Intelligence))<br />[NYTK-NerKor Corpus](https://github.com/nytud/NYTK-NerKor) (Eszter Simon, Noémi Vadász (Department of Language Technology and Applied Linguistics))<br />[Szeged NER Corpus](https://rgai.inf.u-szeged.hu/node/130) (György Szarvas, Richárd Farkas, László Felföldi, András Kocsor, János Csirik (MTA-SZTE Research Group on Artificial Intelligence))<br />[Hungarian lg Floret vectors](https://huggingface.co/huspacy/hu_vectors_web_lg) (Szeged AI) |
85
  | **License** | `cc-by-sa-4.0` |
86
  | **Author** | [SzegedAI, MILAB](https://github.com/huspacy/huspacy) |
87
 
 
111
  | `SENTS_P` | 98.21 |
112
  | `SENTS_R` | 97.55 |
113
  | `SENTS_F` | 97.88 |
114
+ | `TAG_ACC` | 96.40 |
115
+ | `POS_ACC` | 96.55 |
116
+ | `MORPH_ACC` | 93.40 |
117
+ | `MORPH_MICRO_P` | 96.93 |
118
+ | `MORPH_MICRO_R` | 96.11 |
119
+ | `MORPH_MICRO_F` | 96.52 |
120
+ | `LEMMA_ACC` | 97.30 |
121
+ | `DEP_UAS` | 81.04 |
122
+ | `DEP_LAS` | 74.34 |
123
+ | `ENTS_P` | 84.79 |
124
+ | `ENTS_R` | 84.30 |
125
+ | `ENTS_F` | 84.55 |
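
As a quick sanity check on the updated README metrics, the reported NER F-score should equal the harmonic mean of the reported precision and recall. The sketch below assumes that standard F1 definition; the helper name `f1` is illustrative and not part of spaCy. It uses the new NER values from this commit.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# New NER scores from this commit (README model-index / meta.json).
ents_p = 0.8479221927
ents_r = 0.8430028129
ents_f_reported = 0.8454553469

ents_f = f1(ents_p, ents_r)
print(f"recomputed F: {ents_f:.6f}, reported F: {ents_f_reported:.6f}")
```

The same check can be applied to any `p`/`r`/`f` triple in `meta.json`, for example the per-type entries under `ents_per_type`.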
config.cfg CHANGED
@@ -1,8 +1,8 @@
1
  [paths]
2
- parser_model = "models/hu_core_news_md-parser-3.5.2/model-best"
3
- ner_model = "models/hu_core_news_md-ner-3.5.2/model-best"
4
- lemmatizer_lookups = "models/hu_core_news_md-lookup-lemmatizer-3.5.2"
5
- tagger_model = "models/hu_core_news_md-tagger-3.5.2/model-best"
6
  train = null
7
  dev = null
8
  vectors = null
@@ -32,6 +32,7 @@ source = ${paths.lemmatizer_lookups}
32
  [components.morphologizer]
33
  factory = "morphologizer"
34
  extend = false
 
35
  overwrite = true
36
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
37
 
@@ -118,6 +119,7 @@ upstream = "*"
118
 
119
  [components.tagger]
120
  factory = "tagger"
 
121
  neg_prefix = "!"
122
  overwrite = false
123
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}
 
1
  [paths]
2
+ parser_model = "models/hu_core_news_md-parser-3.6.0/model-best"
3
+ ner_model = "models/hu_core_news_md-ner-3.6.0/model-best"
4
+ lemmatizer_lookups = "models/hu_core_news_md-lookup-lemmatizer-3.6.0"
5
+ tagger_model = "models/hu_core_news_md-tagger-3.6.0/model-best"
6
  train = null
7
  dev = null
8
  vectors = null
 
32
  [components.morphologizer]
33
  factory = "morphologizer"
34
  extend = false
35
+ label_smoothing = 0.0
36
  overwrite = true
37
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
38
 
 
119
 
120
  [components.tagger]
121
  factory = "tagger"
122
+ label_smoothing = 0.0
123
  neg_prefix = "!"
124
  overwrite = false
125
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}
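
Both `tagger` and `morphologizer` gain a `label_smoothing = 0.0` setting in this config. As a rough illustration of what such a parameter usually controls, the sketch below applies the textbook label-smoothing formula to a one-hot target. This is an assumption about the general technique, not a copy of spaCy's or Thinc's internal implementation, and `smooth_one_hot` is a hypothetical helper name. With a value of 0.0 the targets are left unchanged.

```python
from typing import List


def smooth_one_hot(target: List[float], label_smoothing: float) -> List[float]:
    """Textbook label smoothing: mix the one-hot target with a uniform distribution."""
    if not 0.0 <= label_smoothing < 1.0:
        raise ValueError("label_smoothing must be in [0, 1)")
    n_classes = len(target)
    uniform = 1.0 / n_classes
    return [(1.0 - label_smoothing) * t + label_smoothing * uniform for t in target]


one_hot = [0.0, 1.0, 0.0, 0.0]
unchanged = smooth_one_hot(one_hot, 0.0)  # the value used in this config
softened = smooth_one_hot(one_hot, 0.1)   # illustrative non-zero value
print(unchanged, softened)
```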
hu_core_news_md-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:7077e87b0093f3d1d2ce8786327346307a2f19405a6ecf5327c667853810baf7
3
- size 126880310
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f9d6fccef1fb7c657e44b246f8eadd6cc6336078522b7fb4b3a61af548667728
3
+ size 126873936
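
The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object id, and the size in bytes. A minimal sketch of parsing such a pointer and checking downloaded bytes against it follows. The function names are hypothetical, and the byte string used at the end is a placeholder rather than the real wheel.

```python
import hashlib
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data: bytes, pointer: Dict[str, str]) -> bool:
    """True if the data has the size and SHA-256 digest recorded in the pointer."""
    algo, _, digest = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    return len(data) == int(pointer["size"]) and hashlib.sha256(data).hexdigest() == digest


# Pointer for the updated wheel, copied from the new side of the diff.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f9d6fccef1fb7c657e44b246f8eadd6cc6336078522b7fb4b3a61af548667728\n"
    "size 126873936\n"
)
print(pointer["size"])

# Placeholder bytes, NOT the real wheel, so this check is expected to fail.
print(matches_pointer(b"not the real wheel", pointer))
```

Note that several component models below change their `oid` while keeping the same `size`, so a size comparison alone would not detect the update.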
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"hu",
3
  "name":"core_news_md",
4
- "version":"3.5.2",
5
  "description":"Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morphologizer, lemmatizer, parser, ner",
6
  "author":"SzegedAI, MILAB",
7
  "email":"gyorgy@orosz.link",
8
  "url":"https://github.com/huspacy/huspacy",
9
  "license":"cc-by-sa-4.0",
10
- "spacy_version":">=3.5.0,<3.6.0",
11
- "spacy_git_version":"Unknown",
12
  "vectors":{
13
  "width":100,
14
  "vectors":200000,
@@ -1271,82 +1271,82 @@
1271
  "sents_p":0.9820627803,
1272
  "sents_r":0.9755011136,
1273
  "sents_f":0.9787709497,
1274
- "tag_acc":0.9633953778,
1275
- "pos_acc":0.964398507,
1276
- "morph_acc":0.9338692698,
1277
- "morph_micro_p":0.9676174788,
1278
- "morph_micro_r":0.9592178771,
1279
- "morph_micro_f":0.9633993698,
1280
  "morph_per_feat":{
1281
  "Definite":{
1282
- "p":0.956837801,
1283
- "r":0.9827344844,
1284
- "f":0.9696132597
1285
  },
1286
  "PronType":{
1287
- "p":0.970814978,
1288
- "r":0.9729580574,
1289
- "f":0.9718853363
1290
  },
1291
  "Case":{
1292
- "p":0.9771084337,
1293
- "r":0.9614700652,
1294
- "f":0.9692261727
1295
  },
1296
  "Degree":{
1297
- "p":0.9371584699,
1298
- "r":0.8560732113,
1299
- "f":0.8947826087
1300
  },
1301
  "Number":{
1302
- "p":0.9876396885,
1303
- "r":0.977543154,
1304
- "f":0.9825654847
1305
  },
1306
  "Mood":{
1307
- "p":0.920824295,
1308
- "r":0.9412416851,
1309
- "f":0.9309210526
1310
  },
1311
  "Person":{
1312
- "p":0.9566666667,
1313
- "r":0.9440789474,
1314
- "f":0.9503311258
1315
  },
1316
  "Tense":{
1317
- "p":0.9609544469,
1318
- "r":0.9790055249,
1319
- "f":0.9698960044
1320
  },
1321
  "VerbForm":{
1322
- "p":0.9490291262,
1323
- "r":0.9406575782,
1324
- "f":0.9448248087
1325
  },
1326
  "Voice":{
1327
- "p":0.9549098196,
1328
- "r":0.9744376278,
1329
- "f":0.9645748988
1330
  },
1331
  "Number[psor]":{
1332
- "p":0.9707174231,
1333
- "r":0.9444444444,
1334
- "f":0.957400722
1335
  },
1336
  "Person[psor]":{
1337
- "p":0.972181552,
1338
- "r":0.9472182596,
1339
- "f":0.9595375723
1340
  },
1341
  "NumType":{
1342
- "p":0.9178743961,
1343
- "r":0.9268292683,
1344
- "f":0.9223300971
1345
  },
1346
  "Reflex":{
1347
  "p":1.0,
1348
- "r":0.625,
1349
- "f":0.7692307692
1350
  },
1351
  "Aspect":{
1352
  "p":0.0,
@@ -1364,114 +1364,114 @@
1364
  "f":1.0
1365
  }
1366
  },
1367
- "lemma_acc":0.9724428284,
1368
- "dep_uas":0.7978436658,
1369
- "dep_las":0.7223314055,
1370
  "dep_las_per_type":{
1371
  "det":{
1372
- "p":0.8576952823,
1373
- "r":0.8829617834,
1374
- "f":0.870145155
1375
  },
1376
  "amod:att":{
1377
- "p":0.8220064725,
1378
- "r":0.830744072,
1379
- "f":0.8263521757
1380
  },
1381
  "nsubj":{
1382
- "p":0.7247557003,
1383
- "r":0.6953125,
1384
- "f":0.7097288676
1385
  },
1386
  "advmod:mode":{
1387
- "p":0.4978723404,
1388
- "r":0.5735294118,
1389
- "f":0.5330296128
1390
  },
1391
  "nmod:att":{
1392
- "p":0.7560553633,
1393
- "r":0.7406779661,
1394
- "f":0.7482876712
1395
  },
1396
  "obl":{
1397
- "p":0.754789272,
1398
- "r":0.7092709271,
1399
- "f":0.7313225058
1400
  },
1401
  "obj":{
1402
- "p":0.8498896247,
1403
- "r":0.8651685393,
1404
- "f":0.8574610245
1405
  },
1406
  "root":{
1407
- "p":0.802690583,
1408
- "r":0.7973273942,
1409
- "f":0.8
1410
  },
1411
  "cc":{
1412
- "p":0.6831460674,
1413
- "r":0.64,
1414
- "f":0.6608695652
1415
  },
1416
  "conj":{
1417
- "p":0.4219858156,
1418
- "r":0.4958333333,
1419
- "f":0.4559386973
1420
  },
1421
  "advmod":{
1422
- "p":0.8369565217,
1423
  "r":0.8105263158,
1424
- "f":0.8235294118
1425
  },
1426
  "flat:name":{
1427
- "p":0.8617511521,
1428
- "r":0.8738317757,
1429
- "f":0.86774942
1430
  },
1431
  "appos":{
1432
- "p":0.35,
1433
- "r":0.2234042553,
1434
- "f":0.2727272727
1435
  },
1436
  "advcl":{
1437
- "p":0.2739726027,
1438
- "r":0.2040816327,
1439
- "f":0.2339181287
1440
  },
1441
  "advmod:tlocy":{
1442
- "p":0.6293436293,
1443
- "r":0.7086956522,
1444
- "f":0.6666666667
1445
  },
1446
  "ccomp:obj":{
1447
- "p":0.2702702703,
1448
- "r":0.303030303,
1449
- "f":0.2857142857
1450
  },
1451
  "mark":{
1452
- "p":0.8125,
1453
- "r":0.8227848101,
1454
- "f":0.8176100629
1455
  },
1456
  "compound:preverb":{
1457
- "p":0.9107142857,
1458
- "r":0.9357798165,
1459
- "f":0.9230769231
1460
  },
1461
  "advmod:locy":{
1462
- "p":0.9166666667,
1463
- "r":0.34375,
1464
- "f":0.5
1465
  },
1466
  "cop":{
1467
- "p":0.7407407407,
1468
- "r":0.487804878,
1469
- "f":0.5882352941
1470
  },
1471
  "nmod:obl":{
1472
- "p":0.1666666667,
1473
- "r":0.15,
1474
- "f":0.1578947368
1475
  },
1476
  "advmod:to":{
1477
  "p":0.0,
@@ -1479,99 +1479,104 @@
1479
  "f":0.0
1480
  },
1481
  "obj:lvc":{
1482
- "p":0.2,
1483
  "r":0.0833333333,
1484
- "f":0.1176470588
1485
  },
1486
  "ccomp:obl":{
1487
- "p":0.3636363636,
1488
- "r":0.375,
1489
- "f":0.3692307692
1490
  },
1491
  "iobj":{
1492
- "p":0.2142857143,
1493
  "r":0.4,
1494
- "f":0.2790697674
1495
  },
1496
  "dep":{
1497
  "p":0.0,
1498
  "r":0.0,
1499
  "f":0.0
1500
  },
1501
- "acl":{
1502
- "p":0.2916666667,
1503
- "r":0.1944444444,
1504
- "f":0.2333333333
1505
- },
1506
- "parataxis":{
1507
- "p":0.1428571429,
1508
- "r":0.0273972603,
1509
- "f":0.0459770115
1510
- },
1511
  "case":{
1512
- "p":0.8905472637,
1513
- "r":0.9132653061,
1514
- "f":0.9017632242
1515
  },
1516
  "csubj":{
1517
- "p":0.3846153846,
1518
  "r":0.2702702703,
1519
- "f":0.3174603175
 
 
 
 
 
1520
  },
1521
  "xcomp":{
1522
- "p":0.7820512821,
1523
- "r":0.8243243243,
1524
- "f":0.8026315789
1525
  },
1526
  "nummod":{
1527
- "p":0.625,
1528
- "r":0.376344086,
1529
- "f":0.4697986577
 
 
 
 
 
 
 
 
 
 
1530
  },
1531
  "advmod:tto":{
1532
  "p":0.5,
1533
- "r":0.3,
1534
- "f":0.375
1535
  },
1536
  "nmod":{
1537
- "p":0.1666666667,
1538
  "r":0.0909090909,
1539
- "f":0.1176470588
 
 
 
 
 
1540
  },
1541
  "aux":{
1542
- "p":0.7777777778,
1543
- "r":0.5833333333,
1544
- "f":0.6666666667
1545
  },
1546
  "advmod:tfrom":{
1547
  "p":0.0,
1548
  "r":0.0,
1549
  "f":0.0
1550
  },
1551
- "list":{
1552
- "p":0.0434782609,
1553
- "r":0.1666666667,
1554
- "f":0.0689655172
1555
- },
1556
  "goeswith":{
1557
  "p":0.0,
1558
  "r":0.0,
1559
  "f":0.0
1560
  },
1561
  "compound":{
1562
- "p":0.75,
1563
- "r":0.975,
1564
- "f":0.847826087
1565
  },
1566
  "obl:lvc":{
1567
  "p":0.0,
1568
  "r":0.0,
1569
  "f":0.0
1570
  },
1571
- "orphan":{
1572
- "p":0.0,
1573
- "r":0.0,
1574
- "f":0.0
1575
  },
1576
  "ccomp":{
1577
  "p":0.0,
@@ -1584,42 +1589,37 @@
1584
  "f":0.0
1585
  },
1586
  "advmod:que":{
1587
- "p":0.0,
1588
- "r":0.0,
1589
- "f":0.0
1590
- },
1591
- "ccomp:pred":{
1592
- "p":0.0,
1593
- "r":0.0,
1594
- "f":0.0
1595
  }
1596
  },
1597
- "ents_p":0.8557640751,
1598
- "ents_r":0.8417721519,
1599
- "ents_f":0.8487104493,
1600
  "ents_per_type":{
1601
  "ORG":{
1602
- "p":0.8894073728,
1603
- "r":0.8836346778,
1604
- "f":0.8865116279
1605
  },
1606
  "PER":{
1607
- "p":0.8679692126,
1608
- "r":0.8757467145,
1609
- "f":0.8718406185
1610
  },
1611
  "LOC":{
1612
- "p":0.8407376362,
1613
- "r":0.8706597222,
1614
- "f":0.8554371002
1615
  },
1616
  "MISC":{
1617
- "p":0.7245614035,
1618
- "r":0.5858156028,
1619
- "f":0.6478431373
1620
  }
1621
  },
1622
- "speed":2553.5927991195
1623
  },
1624
  "sources":[
1625
  {
@@ -1634,17 +1634,17 @@
1634
  "license":"CC BY-SA 4.0",
1635
  "author":"Eszter Simon, No\u00e9mi Vad\u00e1sz (Department of Language Technology and Applied Linguistics)"
1636
  },
1637
- {
1638
- "name":"hunNERwiki",
1639
- "url":"http://hlt.sztaki.hu/resources/hunnerwiki.html",
1640
- "license":"CC-BY-SA-3.0",
1641
- "author":"Eszter Simon, D\u00e1vid M\u00e1rk Nemeskey (HLT Group, Budapest University of Technology and Economics)"
1642
- },
1643
  {
1644
  "name":"Szeged NER Corpus",
1645
  "url":"https://rgai.inf.u-szeged.hu/node/130",
1646
  "license":"CC-BY-NC-SA-3.0",
1647
  "author":"Gy\u00f6rgy Szarvas, Rich\u00e1rd Farkas, L\u00e1szl\u00f3 Felf\u00f6ldi, Andr\u00e1s Kocsor, J\u00e1nos Csirik (MTA-SZTE Research Group on Artificial Intelligence)"
 
 
 
 
 
 
1648
  }
1649
  ],
1650
  "requirements":[
 
1
  {
2
  "lang":"hu",
3
  "name":"core_news_md",
4
+ "version":"3.6.0",
5
  "description":"Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morphologizer, lemmatizer, parser, ner",
6
  "author":"SzegedAI, MILAB",
7
  "email":"gyorgy@orosz.link",
8
  "url":"https://github.com/huspacy/huspacy",
9
  "license":"cc-by-sa-4.0",
10
+ "spacy_version":">=3.6.0,<3.7.0",
11
+ "spacy_git_version":"6fc153a26",
12
  "vectors":{
13
  "width":100,
14
  "vectors":200000,
 
1271
  "sents_p":0.9820627803,
1272
  "sents_r":0.9755011136,
1273
  "sents_f":0.9787709497,
1274
+ "tag_acc":0.9640156953,
1275
+ "pos_acc":0.9655469423,
1276
+ "morph_acc":0.9339649727,
1277
+ "morph_micro_p":0.9693147835,
1278
+ "morph_micro_r":0.9611087237,
1279
+ "morph_micro_f":0.965194312,
1280
  "morph_per_feat":{
1281
  "Definite":{
1282
+ "p":0.9694727105,
1283
+ "r":0.9780681288,
1284
+ "f":0.9737514518
1285
  },
1286
  "PronType":{
1287
+ "p":0.977827051,
1288
+ "r":0.9735099338,
1289
+ "f":0.9756637168
1290
  },
1291
  "Case":{
1292
+ "p":0.9769246071,
1293
+ "r":0.9703615886,
1294
+ "f":0.9736320381
1295
  },
1296
  "Degree":{
1297
+ "p":0.9202834367,
1298
+ "r":0.8643926789,
1299
+ "f":0.8914628915
1300
  },
1301
  "Number":{
1302
+ "p":0.985019357,
1303
+ "r":0.9807273337,
1304
+ "f":0.9828686597
1305
  },
1306
  "Mood":{
1307
+ "p":0.9296703297,
1308
+ "r":0.9379157428,
1309
+ "f":0.9337748344
1310
  },
1311
  "Person":{
1312
+ "p":0.9578163772,
1313
+ "r":0.9523026316,
1314
+ "f":0.9550515464
1315
  },
1316
  "Tense":{
1317
+ "p":0.9681318681,
1318
+ "r":0.973480663,
1319
+ "f":0.9707988981
1320
  },
1321
  "VerbForm":{
1322
+ "p":0.9611486486,
1323
+ "r":0.9125902165,
1324
+ "f":0.9362402304
1325
  },
1326
  "Voice":{
1327
+ "p":0.9634888438,
1328
+ "r":0.9713701431,
1329
+ "f":0.967413442
1330
  },
1331
  "Number[psor]":{
1332
+ "p":0.9709302326,
1333
+ "r":0.9515669516,
1334
+ "f":0.9611510791
1335
  },
1336
  "Person[psor]":{
1337
+ "p":0.9723837209,
1338
+ "r":0.9543509272,
1339
+ "f":0.9632829374
1340
  },
1341
  "NumType":{
1342
+ "p":0.9011764706,
1343
+ "r":0.9341463415,
1344
+ "f":0.9173652695
1345
  },
1346
  "Reflex":{
1347
  "p":1.0,
1348
+ "r":0.875,
1349
+ "f":0.9333333333
1350
  },
1351
  "Aspect":{
1352
  "p":0.0,
 
1364
  "f":1.0
1365
  }
1366
  },
1367
+ "lemma_acc":0.9730169362,
1368
+ "dep_uas":0.8103583867,
1369
+ "dep_las":0.743357861,
1370
  "dep_las_per_type":{
1371
  "det":{
1372
+ "p":0.8618524333,
1373
+ "r":0.8742038217,
1374
+ "f":0.8679841897
1375
  },
1376
  "amod:att":{
1377
+ "p":0.8163580247,
1378
+ "r":0.8650858545,
1379
+ "f":0.8400158793
1380
  },
1381
  "nsubj":{
1382
+ "p":0.7198748044,
1383
+ "r":0.71875,
1384
+ "f":0.7193119625
1385
  },
1386
  "advmod:mode":{
1387
+ "p":0.5789473684,
1388
+ "r":0.5392156863,
1389
+ "f":0.5583756345
1390
  },
1391
  "nmod:att":{
1392
+ "p":0.7376788553,
1393
+ "r":0.786440678,
1394
+ "f":0.7612797375
1395
  },
1396
  "obl":{
1397
+ "p":0.7789954338,
1398
+ "r":0.7677767777,
1399
+ "f":0.7733454216
1400
  },
1401
  "obj":{
1402
+ "p":0.8280542986,
1403
+ "r":0.8224719101,
1404
+ "f":0.825253664
1405
  },
1406
  "root":{
1407
+ "p":0.8183856502,
1408
+ "r":0.8129175947,
1409
+ "f":0.8156424581
1410
  },
1411
  "cc":{
1412
+ "p":0.7096774194,
1413
+ "r":0.6947368421,
1414
+ "f":0.7021276596
1415
  },
1416
  "conj":{
1417
+ "p":0.4771784232,
1418
+ "r":0.4791666667,
1419
+ "f":0.4781704782
1420
  },
1421
  "advmod":{
1422
+ "p":0.8279569892,
1423
  "r":0.8105263158,
1424
+ "f":0.8191489362
1425
  },
1426
  "flat:name":{
1427
+ "p":0.8451327434,
1428
+ "r":0.8925233645,
1429
+ "f":0.8681818182
1430
  },
1431
  "appos":{
1432
+ "p":0.3837209302,
1433
+ "r":0.3510638298,
1434
+ "f":0.3666666667
1435
  },
1436
  "advcl":{
1437
+ "p":0.2941176471,
1438
+ "r":0.306122449,
1439
+ "f":0.3
1440
  },
1441
  "advmod:tlocy":{
1442
+ "p":0.688034188,
1443
+ "r":0.7,
1444
+ "f":0.6939655172
1445
  },
1446
  "ccomp:obj":{
1447
+ "p":0.3513513514,
1448
+ "r":0.3939393939,
1449
+ "f":0.3714285714
1450
  },
1451
  "mark":{
1452
+ "p":0.8113207547,
1453
+ "r":0.8164556962,
1454
+ "f":0.8138801262
1455
  },
1456
  "compound:preverb":{
1457
+ "p":0.9203539823,
1458
+ "r":0.9541284404,
1459
+ "f":0.9369369369
1460
  },
1461
  "advmod:locy":{
1462
+ "p":0.8235294118,
1463
+ "r":0.4375,
1464
+ "f":0.5714285714
1465
  },
1466
  "cop":{
1467
+ "p":0.6666666667,
1468
+ "r":0.5365853659,
1469
+ "f":0.5945945946
1470
  },
1471
  "nmod:obl":{
1472
+ "p":0.2162162162,
1473
+ "r":0.2,
1474
+ "f":0.2077922078
1475
  },
1476
  "advmod:to":{
1477
  "p":0.0,
 
1479
  "f":0.0
1480
  },
1481
  "obj:lvc":{
1482
+ "p":0.5,
1483
  "r":0.0833333333,
1484
+ "f":0.1428571429
1485
  },
1486
  "ccomp:obl":{
1487
+ "p":0.28,
1488
+ "r":0.21875,
1489
+ "f":0.2456140351
1490
  },
1491
  "iobj":{
1492
+ "p":0.3157894737,
1493
  "r":0.4,
1494
+ "f":0.3529411765
1495
  },
1496
  "dep":{
1497
  "p":0.0,
1498
  "r":0.0,
1499
  "f":0.0
1500
  },
 
 
 
 
 
 
 
 
 
 
1501
  "case":{
1502
+ "p":0.9432989691,
1503
+ "r":0.9336734694,
1504
+ "f":0.9384615385
1505
  },
1506
  "csubj":{
1507
+ "p":0.5882352941,
1508
  "r":0.2702702703,
1509
+ "f":0.3703703704
1510
+ },
1511
+ "parataxis":{
1512
+ "p":0.2727272727,
1513
+ "r":0.0410958904,
1514
+ "f":0.0714285714
1515
  },
1516
  "xcomp":{
1517
+ "p":0.8985507246,
1518
+ "r":0.8378378378,
1519
+ "f":0.8671328671
1520
  },
1521
  "nummod":{
1522
+ "p":0.6282051282,
1523
+ "r":0.5268817204,
1524
+ "f":0.5730994152
1525
+ },
1526
+ "acl":{
1527
+ "p":0.3846153846,
1528
+ "r":0.2777777778,
1529
+ "f":0.3225806452
1530
+ },
1531
+ "orphan":{
1532
+ "p":0.0,
1533
+ "r":0.0,
1534
+ "f":0.0
1535
  },
1536
  "advmod:tto":{
1537
  "p":0.5,
1538
+ "r":0.1,
1539
+ "f":0.1666666667
1540
  },
1541
  "nmod":{
1542
+ "p":1.0,
1543
  "r":0.0909090909,
1544
+ "f":0.1666666667
1545
+ },
1546
+ "ccomp:pred":{
1547
+ "p":0.0,
1548
+ "r":0.0,
1549
+ "f":0.0
1550
  },
1551
  "aux":{
1552
+ "p":0.9,
1553
+ "r":0.75,
1554
+ "f":0.8181818182
1555
  },
1556
  "advmod:tfrom":{
1557
  "p":0.0,
1558
  "r":0.0,
1559
  "f":0.0
1560
  },
 
 
 
 
 
1561
  "goeswith":{
1562
  "p":0.0,
1563
  "r":0.0,
1564
  "f":0.0
1565
  },
1566
  "compound":{
1567
+ "p":0.9487179487,
1568
+ "r":0.925,
1569
+ "f":0.9367088608
1570
  },
1571
  "obl:lvc":{
1572
  "p":0.0,
1573
  "r":0.0,
1574
  "f":0.0
1575
  },
1576
+ "list":{
1577
+ "p":0.2,
1578
+ "r":0.1666666667,
1579
+ "f":0.1818181818
1580
  },
1581
  "ccomp":{
1582
  "p":0.0,
 
1589
  "f":0.0
1590
  },
1591
  "advmod:que":{
1592
+ "p":1.0,
1593
+ "r":0.25,
1594
+ "f":0.4
 
 
 
 
 
1595
  }
1596
  },
1597
+ "ents_p":0.8479221927,
1598
+ "ents_r":0.8430028129,
1599
+ "ents_f":0.8454553469,
1600
  "ents_per_type":{
1601
  "ORG":{
1602
+ "p":0.882924572,
1603
+ "r":0.8845618915,
1604
+ "f":0.8837424734
1605
  },
1606
  "PER":{
1607
+ "p":0.8772969769,
1608
+ "r":0.8841099164,
1609
+ "f":0.8806902708
1610
  },
1611
  "LOC":{
1612
+ "p":0.84,
1613
+ "r":0.8567708333,
1614
+ "f":0.8483025355
1615
  },
1616
  "MISC":{
1617
+ "p":0.664556962,
1618
+ "r":0.5957446809,
1619
+ "f":0.6282722513
1620
  }
1621
  },
1622
+ "speed":2618.8542210874
1623
  },
1624
  "sources":[
1625
  {
 
1634
  "license":"CC BY-SA 4.0",
1635
  "author":"Eszter Simon, No\u00e9mi Vad\u00e1sz (Department of Language Technology and Applied Linguistics)"
1636
  },
 
 
 
 
 
 
1637
  {
1638
  "name":"Szeged NER Corpus",
1639
  "url":"https://rgai.inf.u-szeged.hu/node/130",
1640
  "license":"CC-BY-NC-SA-3.0",
1641
  "author":"Gy\u00f6rgy Szarvas, Rich\u00e1rd Farkas, L\u00e1szl\u00f3 Felf\u00f6ldi, Andr\u00e1s Kocsor, J\u00e1nos Csirik (MTA-SZTE Research Group on Artificial Intelligence)"
1642
+ },
1643
+ {
1644
+ "name":"Hungarian lg Floret vectors",
1645
+ "url":"https://huggingface.co/huspacy/hu_vectors_web_lg",
1646
+ "license":"CC-BY-SA-4.0",
1647
+ "author":"Szeged AI"
1648
  }
1649
  ],
1650
  "requirements":[
morphologizer/cfg CHANGED
@@ -1,5 +1,6 @@
1
  {
2
  "extend":false,
 
3
  "labels_morph":{
4
  "Definite=Def|POS=DET|PronType=Art":"Definite=Def|PronType=Art",
5
  "Case=Ine|Number=Sing|POS=NOUN":"Case=Ine|Number=Sing",
 
1
  {
2
  "extend":false,
3
+ "label_smoothing":0.0,
4
  "labels_morph":{
5
  "Definite=Def|POS=DET|PronType=Art":"Definite=Def|PronType=Art",
6
  "Case=Ine|Number=Sing|POS=NOUN":"Case=Ine|Number=Sing",
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e6aff4240ac18467b7dc06107b9c062f36f886d7ce23807af3f8406749a7f14b
3
  size 463022
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2d43683d554c8c726ef065c0db115d588004bf892c00d7032b65456ddb0fa6d9
3
  size 463022
ner/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3ed6dcbe8e6fffa7498e78a2574d8043591cb5d2bb6f9c108fa8e2f1b77f9c30
3
  size 9791307
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:364eda709e2edbfac89812efe2353bb8cecd2854ea39f900e1bd98ce6751ea66
3
  size 9791307
parser/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:78283bea80e3386ef410e1d15c64f01b744d5c5e0fecf2bb19e47229f913fa56
3
  size 25601129
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fee779591ece0e3d7caf9277c67269f4ec014108863356ede7f37ad457ad384f
3
  size 25601129
senter/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b18414fd9e8488e9a3ddd9526bd4d2def3f89bc86cbc29f42913f292c613fbe1
3
  size 1237
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dfb4dcfe2a876d6bade5bf405e9835efe65bb71092aaa1aa86d1ccdc2b255e0a
3
  size 1237
tagger/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels":[
3
  "ADJ",
4
  "ADP",
 
1
  {
2
+ "label_smoothing":0.0,
3
  "labels":[
4
  "ADJ",
5
  "ADP",
tagger/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:6fb3702150c4d01ca856cdb2e672ebef74bb6fcc26598cd8e9299dbd696544de
3
  size 7297
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:632c6f38ad97c0336e65a15ee41d097f45d71233914ea41f2e7010d6af4c89f6
3
  size 7297
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:688fde3edb6700bbe83c5058ee028879580e4886521245e8bdacea95086b4c7b
3
  size 9659749
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7092f2f38ddf3e799dadff6a131296038901b22d90c44940a6a453104db5fc2e
3
  size 9659749
trainable_lemmatizer/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d31925eb4ba242c6b1352d8e203416e400ba9cbd06cb709a0caeabd85bac1649
3
  size 11282980
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:034f6cf0204dd1b27ebac3b5d66fa99ea8cd96e3df746325e228949186826a20
3
  size 11282980
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e84f6f9c6885355eea81dfb3b4e9ca437d1300fcde1cf2895f55aa03f82e8372
3
- size 6405534
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8d5610c17288cd7c1db421eb081e29b13e79705966a21722c748c310f2f1f905
3
+ size 6406437
vocab/vectors.cfg CHANGED
@@ -5,5 +5,6 @@
5
  "hash_count":2,
6
  "hash_seed":2166136261,
7
  "bow":"<",
8
- "eow":">"
 
9
  }
 
5
  "hash_count":2,
6
  "hash_seed":2166136261,
7
  "bow":"<",
8
+ "eow":">",
9
+ "attr":65
10
  }
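
Taken together, this commit moves the pipeline to spaCy 3.6: the README and `meta.json` now declare `spacy_version` as `>=3.6.0,<3.7.0`. The sketch below shows how such a range can be checked against a version string using only the standard library. It handles just plain `X.Y.Z` versions and the comparison operators listed, and the helper names are hypothetical. In practice a full implementation such as `SpecifierSet` from the `packaging` library would be the safer choice.

```python
import operator
import re
from typing import Tuple

_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against comma-separated clauses like '>=3.6.0,<3.7.0'."""
    current = parse_version(version)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not _OPS[op](current, parse_version(bound)):
            return False
    return True


new_spec = ">=3.6.0,<3.7.0"  # spacy_version after this commit
print(satisfies("3.6.1", new_spec))
print(satisfies("3.5.4", new_spec))
```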