KennethEnevoldsen committed on
Commit
0eadea0
1 Parent(s): 9b34c07

Updated to version v0.2.0

.gitattributes CHANGED
@@ -20,3 +20,4 @@
20
  *strings.json filter=lfs diff=lfs merge=lfs -text
21
  vectors filter=lfs diff=lfs merge=lfs -text
22
  model filter=lfs diff=lfs merge=lfs -text
 
 
20
  *strings.json filter=lfs diff=lfs merge=lfs -text
21
  vectors filter=lfs diff=lfs merge=lfs -text
22
  model filter=lfs diff=lfs merge=lfs -text
23
+ entity_linker/kb/* filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,12 +1,22 @@
1
  ---
2
  tags:
3
  - spacy
 
 
4
  - token-classification
5
  language:
6
  - da
7
  license: apache-2.0
8
  model-index:
9
- - name: da_dacy_small_trf
10
  results:
11
  - task:
12
  name: NER
@@ -14,75 +24,172 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.81724846
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8291666667
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8231644261
24
  - task:
25
- name: SENTER
26
  type: token-classification
27
  metrics:
28
- - name: SENTER Precision
29
- type: precision
30
- value: 0.8603839442
31
- - name: SENTER Recall
32
- type: recall
33
- value: 0.8741134752
34
- - name: SENTER F Score
35
- type: f_score
36
- value: 0.8671943712
37
  - task:
38
- name: UNLABELED_DEPENDENCIES
39
  type: token-classification
40
  metrics:
41
- - name: Unlabeled Dependencies Accuracy
42
  type: accuracy
43
- value: 0.8492442546
44
  - task:
45
- name: LABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
- - name: Labeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.8492442546
51
  ---
52
 
53
  <a href="https://github.com/centre-for-humanities-computing/Dacy"><img src="https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png" width="175" height="175" align="right" /></a>
54
 
55
- # DaCy small transformer
56
-
57
 
58
  DaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.
59
- DaCy's largest pipeline has achieved state-of-the-art performance on named entity recognition, part-of-speech tagging and dependency
60
- parsing for Danish on the DaNE dataset. Check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results.
 
61
  DaCy also contains guides on usage of the package as well as behavioural test for biases and robustness of Danish NLP pipelines.
62
-
63
 
64
  | Feature | Description |
65
  | --- | --- |
66
  | **Name** | `da_dacy_small_trf` |
67
- | **Version** | `0.1.0` |
68
- | **spaCy** | `>=3.1.1,<3.2.0` |
69
- | **Default Pipeline** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
70
- | **Components** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
71
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
72
- | **Sources** | [UD Danish DDT v2.5](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[Maltehb/-l-ctra-danish-electra-small-cased](https://huggingface.co/Maltehb/-l-ctra-danish-electra-small-cased) (Malte Højmark-Bertelsen) |
73
- | **License** | `Apache-2.0 License` |
74
- | **Author** | [Centre for Humanities Computing Aarhus](https://chcaa.io/#/) |
75
 
76
  ### Label Scheme
77
 
78
  <details>
79
 
80
- <summary>View label scheme (192 labels for 3 components)</summary>
81
 
82
  | Component | Labels |
83
  | --- | --- |
84
- | **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `POS=CCONJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Degree=Pos\|POS=ADV`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `NumType=Ord\|POS=ADJ`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `POS=ADP\|PartType=Inf`, `Degree=Pos\|POS=ADJ`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, 
`Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Number=Plur\|POS=DET\|PronType=Ind`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `POS=PART\|PartType=Inf`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Mood=Imp\|POS=VERB`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `POS=ADV\|PartType=Inf`, `Degree=Sup\|POS=ADV`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Number=Plur\|POS=PRON\|PronType=Ind`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|POS=PROPN`, `POS=ADP`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=INTJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=SYM`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Degree=Sup\|POS=ADJ`, `Number=Plur\|POS=DET\|PronType=Ind\|Style=Arch`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Foreign=Yes\|POS=X`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, 
`Definite=Ind\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Degree=Abs\|POS=ADV`, `POS=VERB\|VerbForm=Ger`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Pres`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=PRON\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Imp\|POS=AUX`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|POS=NOUN`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=DET\|PronType=Dem`, 
`Definite=Def\|Number=Plur\|POS=NOUN` |
85
- | **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:loc`, `obl:tmod`, `punct`, `xcomp` |
 
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
87
 
88
  </details>
@@ -91,104 +198,37 @@ DaCy also contains guides on usage of the package as well as behavioural test fo
91
 
92
  | Type | Score |
93
  | --- | --- |
94
- | `POS_ACC` | 95.83 |
95
- | `MORPH_ACC` | 95.70 |
96
- | `DEP_UAS` | 84.92 |
97
- | `DEP_LAS` | 81.76 |
98
- | `SENTS_P` | 86.04 |
99
- | `SENTS_R` | 87.41 |
100
- | `SENTS_F` | 86.72 |
101
- | `LEMMA_ACC` | 84.91 |
102
- | `ENTS_F` | 82.32 |
103
- | `ENTS_P` | 81.72 |
104
- | `ENTS_R` | 82.92 |
105
- | `TRANSFORMER_LOSS` | 41746686.63 |
106
- | `MORPHOLOGIZER_LOSS` | 3458966.49 |
107
- | `PARSER_LOSS` | 15104898.38 |
108
- | `NER_LOSS` | 546098.45 |
109
-
110
-
111
- ## Bias and Robustness
112
-
113
- Besides the validation done by spaCy on the DaNE test set, DaCy also applies a series of augmentations to the DaNE test set to see how well the models handle them.
114
- These can be seen as behavioural probes akin to the NLP checklist.
115
-
116
- ### Deterministic Augmentations
117
- Deterministic augmentations are augmentations that always yield the same result.
118
-
119
- | Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) | Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |
120
- | --- | --- | --- | --- | --- | --- | --- | --- |
121
- | No augmentation | 0.98 | 0.974 | 0.868 | 0.836 | 0.936 | 0.844 | 0.765 |
122
- | Æøå Augmentation | 0.955 | 0.948 | 0.823 | 0.783 | 0.922 | 0.754 | 0.718 |
123
- | Lowercase | 0.974 | 0.97 | 0.862 | 0.828 | 0.905 | 0.848 | 0.681 |
124
- | No Spacing | 0.229 | 0.229 | 0.004 | 0.003 | 0.824 | 0.225 | 0.048 |
125
- | Abbreviated first names | 0.979 | 0.973 | 0.864 | 0.832 | 0.94 | 0.845 | 0.699 |
126
- | Input size augmentation 5 sentences | 0.956 | 0.956 | 0.851 | 0.818 | 0.883 | 0.844 | 0.743 |
127
- | Input size augmentation 10 sentences | 0.959 | 0.958 | 0.853 | 0.821 | 0.897 | 0.844 | 0.755 |
128
-
129
-
130
-
131
- ### Stochastic Augmentations
132
- Stochastic augmentations are augmentations that are repeated multiple times to estimate the effect of the augmentation.
133
-
134
- | Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) | Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |
135
- | --- | --- | --- | --- | --- | --- | --- | --- |
136
- | Keystroke errors 2% | 0.931 (0.003) | 0.929 (0.003) | 0.797 (0.003) | 0.753 (0.003) | 0.884 (0.003) | 0.772 (0.003) | 0.657 (0.003) |
137
- | Keystroke errors 5% | 0.859 (0.003) | 0.863 (0.003) | 0.699 (0.003) | 0.641 (0.003) | 0.824 (0.003) | 0.681 (0.003) | 0.53 (0.003) |
138
- | Keystroke errors 15% | 0.633 (0.006) | 0.662 (0.006) | 0.439 (0.006) | 0.358 (0.006) | 0.688 (0.006) | 0.459 (0.006) | 0.293 (0.006) |
139
- | Danish names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.943 (0.0) | 0.847 (0.0) | 0.748 (0.0) |
140
- | Muslim names | 0.979 (0.0) | 0.974 (0.0) | 0.865 (0.0) | 0.833 (0.0) | 0.94 (0.0) | 0.847 (0.0) | 0.732 (0.0) |
141
- | Female names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.946 (0.0) | 0.847 (0.0) | 0.754 (0.0) |
142
- | Male names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.943 (0.0) | 0.847 (0.0) | 0.748 (0.0) |
143
- | Spacing Augmentation 5% | 0.941 (0.002) | 0.936 (0.002) | 0.755 (0.002) | 0.725 (0.002) | 0.907 (0.002) | 0.811 (0.002) | 0.699 (0.002) |
144
-
145
- <details>
146
-
147
- <summary> Description of Augmenters </summary>
148
-
149
-
150
-
151
- **No augmentation:**
152
- Applies no augmentation to the DaNE test set.
153
-
154
- **Æøå Augmentation:**
155
- This augmentation replaces æ, ø, and å with their spelling variants ae, oe, and aa, respectively.
156
-
157
- **Lowercase:**
158
- This augmentation lowercases all text.
159
-
160
- **No Spacing:**
161
- This augmentation removes all spacing from the text.
162
-
163
- **Abbreviated first names:**
164
- This augmentation abbreviates the first names of entities. For instance, 'Kenneth Enevoldsen' becomes 'K. Enevoldsen'.
165
-
166
- **Keystroke errors 2%:**
167
- This augmentation simulates keystroke errors by replacing 2% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.
168
-
169
- **Keystroke errors 5%:**
170
- This augmentation simulates keystroke errors by replacing 5% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.
171
-
172
- **Keystroke errors 15%:**
173
- This augmentation simulates keystroke errors by replacing 15% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.
174
-
175
- **Danish names:**
176
- This augmentation replaces all names with Danish names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.
177
-
178
- **Muslim names:**
179
- This augmentation replaces all names with Muslim names derived from Meldgaard (2005). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.
180
-
181
- **Female names:**
182
- This augmentation replaces all names with Danish female names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.
183
-
184
- **Male names:**
185
- This augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.
186
-
187
- **Spacing Augmentation 5%:**
188
- This augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is reported with its standard deviation in parentheses.
189
- </details>
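The augmenter descriptions above can be illustrated with a short Python sketch. It is an illustration only and not DaCy's implementation: the function names, the capitalised replacements (`Ae`, `Oe`, `Aa`), and the heavily truncated `NEIGHBOURS` map are assumptions made for this example.

```python
import random

# Spelling variants named in the README: æ -> ae, ø -> oe, å -> aa.
# The capitalised forms are an assumption of this sketch.
AEOEAA = str.maketrans(
    {"æ": "ae", "ø": "oe", "å": "aa", "Æ": "Ae", "Ø": "Oe", "Å": "Aa"}
)


def aeoeaa_augment(text: str) -> str:
    """Deterministic: the same input always gives the same output."""
    return text.translate(AEOEAA)


# Hypothetical neighbour map covering only a handful of keys.
# A real augmenter would cover the whole Danish QWERTY layout.
NEIGHBOURS = {
    "a": "qwsz",
    "e": "wsdr",
    "n": "bhjm",
    "o": "iklp",
    "s": "awedxz",
    "t": "rfgy",
}


def keystroke_augment(text: str, rate: float, rng: random.Random) -> str:
    """Stochastic: replace each mapped character with probability `rate`.

    Replacements are always lowercase, a simplification of this sketch.
    """
    out = []
    for ch in text:
        options = NEIGHBOURS.get(ch.lower())
        if options and rng.random() < rate:
            out.append(rng.choice(options))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(aeoeaa_augment("Blåbærgrød"))
    print(keystroke_augment("en test af en sætning", 0.15, random.Random(1)))
```

Because the keystroke augmenter is random, an evaluation along the lines described above would call it repeatedly with different seeds (the README states 20 repetitions) and report the mean and standard deviation of the resulting scores.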
190
- <br />
191
-
192
-
193
- ### Hardware
194
- This was run and trained on a Quadro RTX 8000 GPU.
 
1
  ---
2
  tags:
3
  - spacy
4
+ - dacy
5
+ - danish
6
  - token-classification
7
+ - pos tagging
8
+ - morphological analysis
9
+ - lemmatization
10
+ - dependency parsing
11
+ - named entity recognition
12
+ - coreference resolution
13
+ - named entity linking
14
+ - named entity disambiguation
15
  language:
16
  - da
17
  license: apache-2.0
18
  model-index:
19
+ - name: da_dacy_small_trf-0.2.0
20
  results:
21
  - task:
22
  name: NER
 
24
  metrics:
25
  - name: NER Precision
26
  type: precision
27
+ value: 0.8306010929
28
  - name: NER Recall
29
  type: recall
30
+ value: 0.8172043011
31
  - name: NER F Score
32
  type: f_score
33
+ value: 0.8238482385
34
+ dataset:
35
+ name: DaNE
36
+ split: test
37
+ type: dane
38
  - task:
39
+ name: TAG
40
  type: token-classification
41
  metrics:
42
+ - name: TAG (XPOS) Accuracy
43
+ type: accuracy
44
+ value: 0.9846798742
45
+ dataset:
46
+ name: UD Danish DDT
47
+ split: test
48
+ type: universal_dependencies
49
+ config: da_ddt
 
50
  - task:
51
+ name: POS
52
  type: token-classification
53
  metrics:
54
+ - name: POS (UPOS) Accuracy
55
  type: accuracy
56
+ value: 0.9842315369
57
+ dataset:
58
+ name: UD Danish DDT
59
+ split: test
60
+ type: universal_dependencies
61
+ config: da_ddt
62
  - task:
63
+ name: MORPH
64
+ type: token-classification
65
+ metrics:
66
+ - name: Morph (UFeats) Accuracy
67
+ type: accuracy
68
+ value: 0.9772942762
69
+ dataset:
70
+ name: UD Danish DDT
71
+ split: test
72
+ type: universal_dependencies
73
+ config: da_ddt
74
+ - task:
75
+ name: LEMMA
76
  type: token-classification
77
  metrics:
78
+ - name: Lemma Accuracy
79
  type: accuracy
80
+ value: 0.9466699925
81
+ dataset:
82
+ name: UD Danish DDT
83
+ split: test
84
+ type: universal_dependencies
85
+ config: da_ddt
86
+ - task:
87
+ name: UNLABELED_DEPENDENCIES
88
+ type: token-classification
89
+ metrics:
90
+ - name: Unlabeled Attachment Score (UAS)
91
+ type: f_score
92
+ value: 0.8978522787
93
+ dataset:
94
+ name: UD Danish DDT
95
+ split: test
96
+ type: universal_dependencies
97
+ config: da_ddt
98
+ - task:
99
+ name: LABELED_DEPENDENCIES
100
+ type: token-classification
101
+ metrics:
102
+ - name: Labeled Attachment Score (LAS)
103
+ type: f_score
104
+ value: 0.8701623698
105
+ dataset:
106
+ name: UD Danish DDT
107
+ split: test
108
+ type: universal_dependencies
109
+ config: da_ddt
110
+ - task:
111
+ name: SENTS
112
+ type: token-classification
113
+ metrics:
114
+ - name: Sentences F-Score
115
+ type: f_score
116
+ value: 0.9433304272
117
+ dataset:
118
+ name: UD Danish DDT
119
+ split: test
120
+ type: universal_dependencies
121
+ config: da_ddt
122
+ - task:
123
+ name: coreference-resolution
124
+ type: coreference-resolution
125
+ metrics:
126
+ - name: LEA
127
+ type: f_score
128
+ value: 0.4218334451
129
+ dataset:
130
+ name: DaCoref
131
+ type: alexandrainst/dacoref
132
+ split: custom
133
+ - task:
134
+ name: coreference-resolution
135
+ type: coreference-resolution
136
+ metrics:
137
+ - name: Named entity Linking Precision
138
+ type: precision
139
+ value: 0.8461538462
140
+ - name: Named entity Linking Recall
141
+ type: recall
142
+ value: 0.2222222222
143
+ - name: Named entity Linking F Score
144
+ type: f_score
145
+ value: 0.352
146
+ dataset:
147
+ name: DaNED
148
+ type: named-entity-linking
149
+ split: custom
150
+ library_name: spacy
151
+ datasets:
152
+ - universal_dependencies
153
+ - dane
154
+ - alexandrainst/dacoref
155
+ metrics:
156
+ - accuracy
157
  ---
158
 
159
  <a href="https://github.com/centre-for-humanities-computing/Dacy"><img src="https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png" width="175" height="175" align="right" /></a>
160
 
161
+ # DaCy small
 
162
 
163
  DaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.
164
+ DaCy's largest pipeline has achieved state-of-the-art performance on part-of-speech tagging and dependency
165
+ parsing for Danish on the Danish Dependency treebank as well as competitive performance on named entity recognition, named entity disambiguation and coreference resolution.
166
+ To read more check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results.
167
  DaCy also contains guides on usage of the package as well as behavioural test for biases and robustness of Danish NLP pipelines.
168
+
169
 
170
  | Feature | Description |
171
  | --- | --- |
172
  | **Name** | `da_dacy_small_trf` |
173
+ | **Version** | `0.2.0` |
174
+ | **spaCy** | `>=3.5.2,<3.6.0` |
175
+ | **Default Pipeline** | `transformer`, `tagger`, `morphologizer`, `trainable_lemmatizer`, `parser`, `ner`, `coref`, `span_resolver`, `span_cleaner`, `entity_linker` |
176
+ | **Components** | `transformer`, `tagger`, `morphologizer`, `trainable_lemmatizer`, `parser`, `ner`, `coref`, `span_resolver`, `span_cleaner`, `entity_linker` |
177
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
178
+ | **Sources** | [UD Danish DDT v2.11](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://huggingface.co/datasets/dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[DaCoref](https://huggingface.co/datasets/alexandrainst/dacoref) (Buch-Kromann, Matthias)<br />[DaNED](https://danlp-alexandra.readthedocs.io/en/stable/docs/datasets.html#daned) (Barrett, M. J., Lam, H., Wu, M., Lacroix, O., Plank, B., & Søgaard, A.)<br />[jonfd/electra-small-nordic](https://huggingface.co/jonfd/electra-small-nordic) (Jón Friðrik Daðason) |
179
+ | **License** | `Apache-2.0` |
180
+ | **Author** | [Kenneth Enevoldsen](https://chcaa.io/#/) |
181
 
182
  ### Label Scheme
183
 
184
  <details>
185
 
186
+ <summary>View label scheme (211 labels for 4 components)</summary>
187
 
188
  | Component | Labels |
189
  | --- | --- |
190
+ | **`tagger`** | `ADJ`, `ADP`, `ADV`, `AUX`, `CCONJ`, `DET`, `INTJ`, `NOUN`, `NUM`, `PART`, `PRON`, `PROPN`, `PUNCT`, `SCONJ`, `SYM`, `VERB`, `X` |
191
+ | **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `NumType=Ord\|POS=ADJ`, `POS=CCONJ`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Sup\|POS=ADV`, `Degree=Pos\|POS=ADV`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Number=Plur\|POS=DET\|PronType=Ind`, `POS=ADP`, `POS=ADV\|PartType=Inf`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=ADP\|PartType=Inf`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `NumType=Card\|POS=NUM`, `Degree=Pos\|POS=ADJ`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PART\|PartType=Inf`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, 
`Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Number=Plur\|POS=PRON\|PronType=Ind`, `POS=INTJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Case=Gen\|POS=PROPN`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=PRON\|PronType=Dem`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=NUM`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `POS=PRON`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Definite=Ind\|Number=Sing\|POS=NUM`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, 
`Foreign=Yes\|POS=ADV`, `POS=NOUN`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Degree=Sup\|POS=ADJ`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Mood=Imp\|POS=VERB`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `POS=X`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=VERB\|VerbForm=Ger`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Foreign=Yes\|POS=X`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Number=Plur\|POS=PRON\|PronType=Rcp`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `POS=SYM`, `POS=DET\|PronType=Dem`, `Gender=Com\|Number=Sing\|POS=NUM`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `POS=VERB\|Tense=Pres`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NUM`, `Degree=Abs\|POS=ADV`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|POS=NOUN`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=NOUN`, `Case=Gen\|POS=NOUN`, `POS=AUX\|Tense=Pres\|VerbForm=Part` |
+ | **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `advmod:lmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:lmod`, `obl:tmod`, `punct`, `xcomp` |
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |

  </details>
 

  | Type | Score |
  | --- | --- |
+ | `TOKEN_ACC` | 99.92 |
+ | `TOKEN_P` | 99.70 |
+ | `TOKEN_R` | 99.77 |
+ | `TOKEN_F` | 99.74 |
+ | `SENTS_P` | 92.96 |
+ | `SENTS_R` | 95.75 |
+ | `SENTS_F` | 94.33 |
+ | `TAG_ACC` | 98.47 |
+ | `POS_ACC` | 98.42 |
+ | `MORPH_ACC` | 97.73 |
+ | `MORPH_MICRO_P` | 98.94 |
+ | `MORPH_MICRO_R` | 98.33 |
+ | `MORPH_MICRO_F` | 98.64 |
+ | `DEP_UAS` | 89.79 |
+ | `DEP_LAS` | 87.02 |
+ | `ENTS_P` | 83.06 |
+ | `ENTS_R` | 81.72 |
+ | `ENTS_F` | 82.38 |
+ | `LEMMA_ACC` | 94.67 |
+ | `COREF_LEA_F1` | 42.18 |
+ | `COREF_LEA_PRECISION` | 44.79 |
+ | `COREF_LEA_RECALL` | 39.86 |
+ | `NEL_SCORE` | 35.20 |
+ | `NEL_MICRO_P` | 84.62 |
+ | `NEL_MICRO_R` | 22.22 |
+ | `NEL_MICRO_F` | 35.20 |
+ | `NEL_MACRO_P` | 87.68 |
+ | `NEL_MACRO_R` | 24.76 |
+ | `NEL_MACRO_F` | 37.52 |
+
+
+
+ ### Training
+ This model was trained using [spaCy](https://spacy.io), and the training runs were logged to [Weights & Biases](https://wandb.ai/kenevoldsen/dacy-v0.2.0), where all training logs are available.
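The F-scores in the score table above appear to be the usual harmonic mean of the listed precision and recall. A minimal sketch that recomputes them from the rounded table values follows; the helper name `f_score` is an illustrative assumption and not part of spaCy or DaCy, and small deviations in the last digit are expected because the inputs are already rounded.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the score table above.
reported = {
    "TOKEN": (99.70, 99.77, 99.74),
    "SENTS": (92.96, 95.75, 94.33),
    "MORPH_MICRO": (98.94, 98.33, 98.64),
    "ENTS": (83.06, 81.72, 82.38),
    "COREF_LEA": (44.79, 39.86, 42.18),
    "NEL_MICRO": (84.62, 22.22, 35.20),
}

for name, (p, r, f) in reported.items():
    # Print the recomputed value next to the reported one for comparison.
    print(f"{name}: recomputed {f_score(p, r):.2f}, reported {f:.2f}")
```

The `NEL_MACRO_F` row is left out on purpose: the harmonic mean of the macro precision and macro recall gives roughly 38.6 rather than the reported 37.52, which would be consistent with a macro F-score averaged over per-item F-scores instead.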
 
config.cfg CHANGED
@@ -1,54 +1,101 @@
1
  [paths]
2
- train = "corpus/dane/train.spacy"
3
- dev = "corpus/dane/dev.spacy"
4
- vectors = null
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
 
8
 
9
  [system]
10
  gpu_allocator = "pytorch"
11
- seed = 1
12
 
13
  [nlp]
14
  lang = "da"
15
- pipeline = ["transformer","morphologizer","parser","attribute_ruler","lemmatizer","ner"]
 
16
  disabled = []
17
  before_creation = null
18
  after_creation = null
19
  after_pipeline_creation = null
20
- batch_size = 64
21
  tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
22
 
23
  [components]
24
 
25
- [components.attribute_ruler]
26
- factory = "attribute_ruler"
27
- validate = false
28
 
29
- [components.lemmatizer]
30
- factory = "lemmatizer"
31
- mode = "lookup"
32
- model = null
33
- overwrite = false
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
 
 
 
37
 
38
  [components.morphologizer.model]
39
- @architectures = "spacy.Tagger.v1"
40
  nO = null
 
41
 
42
  [components.morphologizer.model.tok2vec]
43
  @architectures = "spacy-transformers.TransformerListener.v1"
44
  grad_factor = 1.0
45
- upstream = "transformer"
46
  pooling = {"@layers":"reduce_mean.v1"}
 
47
 
48
  [components.ner]
49
  factory = "ner"
50
  incorrect_spans_key = null
51
  moves = null
 
52
  update_with_oracle_cut_size = 100
53
 
54
  [components.ner.model]
@@ -63,95 +110,169 @@ nO = null
63
  [components.ner.model.tok2vec]
64
  @architectures = "spacy-transformers.TransformerListener.v1"
65
  grad_factor = 1.0
66
- upstream = "transformer"
67
  pooling = {"@layers":"reduce_mean.v1"}
 
68
 
69
  [components.parser]
70
  factory = "parser"
71
  learn_tokens = false
72
  min_action_freq = 30
73
  moves = null
 
74
  update_with_oracle_cut_size = 100
75
 
76
  [components.parser.model]
77
  @architectures = "spacy.TransitionBasedParser.v2"
78
  state_type = "parser"
79
  extra_state_tokens = false
80
- hidden_width = 64
81
- maxout_pieces = 2
82
  use_upper = false
83
  nO = null
84
 
85
  [components.parser.model.tok2vec]
86
  @architectures = "spacy-transformers.TransformerListener.v1"
87
  grad_factor = 1.0
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
88
  upstream = "transformer"
89
  pooling = {"@layers":"reduce_mean.v1"}
90
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
91
  [components.transformer]
92
  factory = "transformer"
93
  max_batch_items = 4096
94
  set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
95
 
96
  [components.transformer.model]
97
- @architectures = "spacy-transformers.TransformerModel.v1"
98
- name = "Maltehb/-l-ctra-danish-electra-small-cased"
 
99
 
100
  [components.transformer.model.get_spans]
101
  @span_getters = "spacy-transformers.strided_spans.v1"
102
  window = 128
103
  stride = 96
104
 
 
 
105
  [components.transformer.model.tokenizer_config]
106
  use_fast = true
107
 
 
 
108
  [corpora]
109
 
110
  [corpora.dev]
111
  @readers = "spacy.Corpus.v1"
112
- limit = 0
113
- max_length = 0
114
- path = ${paths:dev}
115
  gold_preproc = false
 
 
116
  augmenter = null
117
 
118
  [corpora.train]
119
  @readers = "spacy.Corpus.v1"
120
- path = ${paths:train}
121
- max_length = 500
122
  gold_preproc = false
 
123
  limit = 0
124
-
125
- [corpora.train.augmenter]
126
- @augmenters = "spacy.lower_case.v1"
127
- level = 0.1
128
 
129
  [training]
130
- train_corpus = "corpora.train"
131
- dev_corpus = "corpora.dev"
132
- seed = ${system:seed}
133
- gpu_allocator = ${system:gpu_allocator}
134
  dropout = 0.1
135
- accumulate_gradient = 3
136
- patience = 5000
137
  max_epochs = 0
138
- max_steps = 40000
139
- eval_frequency = 1000
140
  frozen_components = []
141
- before_to_disk = null
142
  annotating_components = []
 
 
 
 
143
 
144
  [training.batcher]
145
- @batchers = "spacy.batch_by_padded.v1"
146
- discard_oversize = true
 
147
  get_length = null
148
- size = 2000
149
- buffer = 256
 
 
 
 
 
150
 
151
  [training.logger]
152
- @loggers = "spacy.WandbLogger.v1"
153
- project_name = "dacy-an-efficient-pipeline-for-danish"
154
- remove_config_values = []
155
 
156
  [training.optimizer]
157
  @optimizers = "Adam.v1"
@@ -160,66 +281,44 @@ beta2 = 0.999
160
  L2_is_weight_decay = true
161
  L2 = 0.01
162
  grad_clip = 1.0
163
- use_averages = true
164
  eps = 0.00000001
165
-
166
- [training.optimizer.learn_rate]
167
- @schedules = "warmup_linear.v1"
168
- warmup_steps = 250
169
- total_steps = 20000
170
- initial_rate = 0.00005
171
 
172
  [training.score_weights]
173
- pos_acc = 0.08
174
- morph_acc = 0.08
 
175
  morph_per_feat = null
176
- dep_uas = 0.0
177
- dep_las = 0.16
 
178
  dep_las_per_type = null
179
  sents_p = null
180
  sents_r = null
181
- sents_f = 0.02
182
- lemma_acc = 0.5
183
- ents_f = 0.16
184
  ents_p = 0.0
185
  ents_r = 0.0
186
  ents_per_type = null
 
 
 
 
 
 
 
187
 
188
  [pretraining]
189
 
190
  [initialize]
191
- vocab_data = ${paths.vocab_data}
192
  vectors = ${paths.vectors}
193
  init_tok2vec = ${paths.init_tok2vec}
 
 
194
  before_init = null
195
  after_init = null
196
 
197
  [initialize.components]
198
 
199
- [initialize.components.morphologizer]
200
-
201
- [initialize.components.morphologizer.labels]
202
- @readers = "spacy.read_labels.v1"
203
- path = "corpus/labels/morphologizer.json"
204
- require = false
205
-
206
- [initialize.components.ner]
207
-
208
- [initialize.components.ner.labels]
209
- @readers = "spacy.read_labels.v1"
210
- path = "corpus/labels/ner.json"
211
- require = false
212
-
213
- [initialize.components.parser]
214
-
215
- [initialize.components.parser.labels]
216
- @readers = "spacy.read_labels.v1"
217
- path = "corpus/labels/parser.json"
218
- require = false
219
-
220
- [initialize.lookups]
221
- @misc = "spacy.LookupsDataLoader.v1"
222
- lang = ${nlp.lang}
223
- tables = ["lexeme_norm"]
224
-
225
  [initialize.tokenizer]
 
1
  [paths]
2
+ train = null
3
+ dev = null
 
 
4
  init_tok2vec = null
5
+ vectors = null
6
+ model_source = "training/da_dacy_small_trf2/model-last"
7
 
8
  [system]
9
  gpu_allocator = "pytorch"
10
+ seed = 0
11
 
12
  [nlp]
13
  lang = "da"
14
+ pipeline = ["transformer","tagger","morphologizer","trainable_lemmatizer","parser","ner","coref","span_resolver","span_cleaner","entity_linker"]
15
+ batch_size = 512
16
  disabled = []
17
  before_creation = null
18
  after_creation = null
19
  after_pipeline_creation = null
 
20
  tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
21
 
22
  [components]
23
 
24
+ [components.coref]
25
+ factory = "experimental_coref"
26
+ span_cluster_prefix = "coref_head_clusters"
27
 
28
+ [components.coref.model]
29
+ @architectures = "spacy-experimental.Coref.v1"
30
+ distance_embedding_size = 20
31
+ dropout = 0.3
32
+ hidden_size = 1024
33
+ depth = 2
34
+ antecedent_limit = 100
35
+ antecedent_batch_size = 512
36
+
37
+ [components.coref.model.tok2vec]
38
+ @architectures = "spacy-transformers.TransformerListener.v1"
39
+ grad_factor = 0.5
40
+ upstream = "transformer"
41
+ pooling = {"@layers":"reduce_mean.v1"}
42
+
43
+ [components.coref.scorer]
44
+ @scorers = "spacy-experimental.coref_scorer.v1"
45
+ span_cluster_prefix = "coref_head_clusters"
46
+
47
+ [components.entity_linker]
48
+ factory = "entity_linker"
49
+ candidates_batch_size = 1
50
+ entity_vector_length = 768
51
+ generate_empty_kb = {"@misc":"spacy.EmptyKB.v2"}
52
+ get_candidates = {"@misc":"spacy.CandidateGenerator.v1"}
53
+ get_candidates_batch = {"@misc":"spacy.CandidateBatchGenerator.v1"}
54
+ incl_context = true
55
+ incl_prior = true
56
+ labels_discard = []
57
+ n_sents = 0
58
+ overwrite = true
59
+ scorer = {"@scorers":"spacy.entity_linker_scorer.v1"}
60
+ threshold = null
61
+ use_gold_ents = true
62
+
63
+ [components.entity_linker.model]
64
+ @architectures = "spacy.EntityLinker.v2"
65
+ nO = null
66
+
67
+ [components.entity_linker.model.tok2vec]
68
+ @architectures = "spacy.HashEmbedCNN.v2"
69
+ pretrained_vectors = null
70
+ width = 96
71
+ depth = 2
72
+ embed_size = 2000
73
+ window_size = 1
74
+ maxout_pieces = 3
75
+ subword_features = true
76
 
77
  [components.morphologizer]
78
  factory = "morphologizer"
79
+ extend = false
80
+ overwrite = true
81
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
82
 
83
  [components.morphologizer.model]
84
+ @architectures = "spacy.Tagger.v2"
85
  nO = null
86
+ normalize = false
87
 
88
  [components.morphologizer.model.tok2vec]
89
  @architectures = "spacy-transformers.TransformerListener.v1"
90
  grad_factor = 1.0
 
91
  pooling = {"@layers":"reduce_mean.v1"}
92
+ upstream = "transformer"
93
 
94
  [components.ner]
95
  factory = "ner"
96
  incorrect_spans_key = null
97
  moves = null
98
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
99
  update_with_oracle_cut_size = 100
100
 
101
  [components.ner.model]
 
110
  [components.ner.model.tok2vec]
111
  @architectures = "spacy-transformers.TransformerListener.v1"
112
  grad_factor = 1.0
 
113
  pooling = {"@layers":"reduce_mean.v1"}
114
+ upstream = "transformer"
115
 
116
  [components.parser]
117
  factory = "parser"
118
  learn_tokens = false
119
  min_action_freq = 30
120
  moves = null
121
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
122
  update_with_oracle_cut_size = 100
123
 
124
  [components.parser.model]
125
  @architectures = "spacy.TransitionBasedParser.v2"
126
  state_type = "parser"
127
  extra_state_tokens = false
128
+ hidden_width = 128
129
+ maxout_pieces = 3
130
  use_upper = false
131
  nO = null
132
 
133
  [components.parser.model.tok2vec]
134
  @architectures = "spacy-transformers.TransformerListener.v1"
135
  grad_factor = 1.0
136
+ pooling = {"@layers":"reduce_mean.v1"}
137
+ upstream = "transformer"
138
+
139
+ [components.span_cleaner]
140
+ factory = "experimental_span_cleaner"
141
+ prefix = "coref_head_clusters"
142
+
143
+ [components.span_resolver]
144
+ factory = "experimental_span_resolver"
145
+ input_prefix = "coref_head_clusters"
146
+ output_prefix = "coref_clusters"
147
+
148
+ [components.span_resolver.model]
149
+ @architectures = "spacy-experimental.SpanResolver.v1"
150
+ hidden_size = 1024
151
+ distance_embedding_size = 64
152
+ conv_channels = 4
153
+ window_size = 1
154
+ max_distance = 128
155
+ prefix = "coref_head_clusters"
156
+
157
+ [components.span_resolver.model.tok2vec]
158
+ @architectures = "spacy-transformers.TransformerListener.v1"
159
+ grad_factor = 0.0
160
  upstream = "transformer"
161
  pooling = {"@layers":"reduce_mean.v1"}
162
 
163
+ [components.span_resolver.scorer]
164
+ @scorers = "spacy-experimental.span_resolver_scorer.v1"
165
+ input_prefix = "coref_head_clusters"
166
+ output_prefix = "coref_clusters"
167
+
168
+ [components.tagger]
169
+ factory = "tagger"
170
+ neg_prefix = "!"
171
+ overwrite = false
172
+ scorer = {"@scorers":"spacy.tagger_scorer.v1"}
173
+
174
+ [components.tagger.model]
175
+ @architectures = "spacy.Tagger.v2"
176
+ nO = null
177
+ normalize = false
178
+
179
+ [components.tagger.model.tok2vec]
180
+ @architectures = "spacy-transformers.TransformerListener.v1"
181
+ grad_factor = 1.0
182
+ pooling = {"@layers":"reduce_mean.v1"}
183
+ upstream = "transformer"
184
+
185
+ [components.trainable_lemmatizer]
186
+ factory = "trainable_lemmatizer"
187
+ backoff = "orth"
188
+ min_tree_freq = 3
189
+ overwrite = false
190
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
191
+ top_k = 1
192
+
193
+ [components.trainable_lemmatizer.model]
194
+ @architectures = "spacy.Tagger.v2"
195
+ nO = null
196
+ normalize = false
197
+
198
+ [components.trainable_lemmatizer.model.tok2vec]
199
+ @architectures = "spacy-transformers.TransformerListener.v1"
200
+ grad_factor = 1.0
201
+ pooling = {"@layers":"reduce_mean.v1"}
202
+ upstream = "transformer"
203
+
204
  [components.transformer]
205
  factory = "transformer"
206
  max_batch_items = 4096
207
  set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
208
 
209
  [components.transformer.model]
210
+ @architectures = "spacy-transformers.TransformerModel.v3"
211
+ name = "jonfd/electra-small-nordic"
212
+ mixed_precision = false
213
 
214
  [components.transformer.model.get_spans]
215
  @span_getters = "spacy-transformers.strided_spans.v1"
216
  window = 128
217
  stride = 96
218
 
219
+ [components.transformer.model.grad_scaler_config]
220
+
221
  [components.transformer.model.tokenizer_config]
222
  use_fast = true
223
 
224
+ [components.transformer.model.transformer_config]
225
+
226
  [corpora]
227
 
228
  [corpora.dev]
229
  @readers = "spacy.Corpus.v1"
230
+ path = ${paths.dev}
 
 
231
  gold_preproc = false
232
+ max_length = 0
233
+ limit = 0
234
  augmenter = null
235
 
236
  [corpora.train]
237
  @readers = "spacy.Corpus.v1"
238
+ path = ${paths.train}
 
239
  gold_preproc = false
240
+ max_length = 0
241
  limit = 0
242
+ augmenter = null
 
 
 
243
 
244
  [training]
245
+ seed = ${system.seed}
246
+ gpu_allocator = ${system.gpu_allocator}
 
 
247
  dropout = 0.1
248
+ accumulate_gradient = 1
249
+ patience = 1600
250
  max_epochs = 0
251
+ max_steps = 20000
252
+ eval_frequency = 200
253
  frozen_components = []
 
254
  annotating_components = []
255
+ dev_corpus = "corpora.dev"
256
+ train_corpus = "corpora.train"
257
+ before_to_disk = null
258
+ before_update = null
259
 
260
  [training.batcher]
261
+ @batchers = "spacy.batch_by_words.v1"
262
+ discard_oversize = false
263
+ tolerance = 0.2
264
  get_length = null
265
+
266
+ [training.batcher.size]
267
+ @schedules = "compounding.v1"
268
+ start = 100
269
+ stop = 1000
270
+ compound = 1.001
271
+ t = 0.0
272
 
273
  [training.logger]
274
+ @loggers = "spacy.ConsoleLogger.v1"
275
+ progress_bar = false
 
276
 
277
  [training.optimizer]
278
  @optimizers = "Adam.v1"
 
281
  L2_is_weight_decay = true
282
  L2 = 0.01
283
  grad_clip = 1.0
284
+ use_averages = false
285
  eps = 0.00000001
286
+ learn_rate = 0.001
 
 
 
 
 
287
 
288
  [training.score_weights]
289
+ tag_acc = 0.12
290
+ pos_acc = 0.06
291
+ morph_acc = 0.06
292
  morph_per_feat = null
293
+ lemma_acc = 0.12
294
+ dep_uas = 0.06
295
+ dep_las = 0.06
296
  dep_las_per_type = null
297
  sents_p = null
298
  sents_r = null
299
+ sents_f = 0.0
300
+ ents_f = 0.12
 
301
  ents_p = 0.0
302
  ents_r = 0.0
303
  ents_per_type = null
304
+ coref_f = 0.12
305
+ coref_p = null
306
+ coref_r = null
307
+ span_accuracy = 0.12
308
+ nel_micro_f = 0.12
309
+ nel_micro_r = null
310
+ nel_micro_p = null
311
 
312
  [pretraining]
313
 
314
  [initialize]
 
315
  vectors = ${paths.vectors}
316
  init_tok2vec = ${paths.init_tok2vec}
317
+ vocab_data = null
318
+ lookups = null
319
  before_init = null
320
  after_init = null
321
 
322
  [initialize.components]
323
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
324
  [initialize.tokenizer]
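The new `[training.batcher.size]` block in this config uses a compounding schedule (`start = 100`, `stop = 1000`, `compound = 1.001`). As a rough illustration of what such a schedule does, and not a copy of the library's implementation, the batch size can be modeled as the start value multiplied by the compound rate once per step and capped at the stop value. The function name `compounding_size` is hypothetical.

```python
import math


def compounding_size(step: int, start: float = 100.0, stop: float = 1000.0,
                     compound: float = 1.001) -> float:
    """Approximate batch size (in words) at a step: geometric growth capped at `stop`."""
    return min(stop, start * compound ** step)


# Number of steps until the cap is reached under this simplified model.
steps_to_cap = math.ceil(math.log(1000.0 / 100.0) / math.log(1.001))

for step in (0, 500, 1000, 2000, steps_to_cap):
    print(step, round(compounding_size(step), 1))
```

Under this model the batch size grows slowly at first and only reaches the cap after a few thousand steps, well within the configured `max_steps = 20000`.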
coref/cfg ADDED
@@ -0,0 +1,3 @@
+ {
+ "nI":256
+ }
transformer/model/pytorch_model.bin → coref/model RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d65643fe23c672180685635b539688406638af1f7e515cb89505ea7626127400
- size 54773654
+ oid sha256:354989d322d38a9108bca66e2df65492f2e0c8f816311f10e9ea993616948325
+ size 9808220
da_dacy_small_trf-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9a76f9af63a196fccfc13b6dab46ef46ac1ba1202c15ad38b7189b07ee6e62be
- size 57514565
+ oid sha256:a2375d7fa15779923cc3583195eabb5276c656acd11d3ac95b645d33fba6ecc1
+ size 101319844
entity_linker/cfg ADDED
@@ -0,0 +1,3 @@
+ {
+ "overwrite":true
+ }
entity_linker/kb/contents ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6afd92b9c2f1503893028925e3cee5e6ee16e3e2d6cf3e03fddafe9edee0ea22
+ size 3035652
entity_linker/kb/strings.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e23b286fc3491d7a954e1d6156fb738028114856322ad8c1035b2f377762f271
+ size 544387
entity_linker/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ad9a5f93c7fb173b2303509b4177f7905beb321a80ab6569154b3dd095b3a6d
+ size 3212918
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"da",
3
  "name":"dacy_small_trf",
4
- "version":"0.1.0",
5
- "description":"\n<a href=\"https://github.com/centre-for-humanities-computing/Dacy\"><img src=\"https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png\" width=\"175\" height=\"175\" align=\"right\" /></a>\n\n# DaCy small transformer\n\nDaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.\nDaCy's largest pipeline has achieved State-of-the-Art performance on Named entity recognition, part-of-speech tagging and dependency \nparsing for Danish on the DaNE dataset. Check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results. \nDaCy also contains guides on usage of the package as well as behavioural test for biases and robustness of Danish NLP pipelines.\n ",
6
- "author":"Centre for Humanities Computing Aarhus",
7
  "email":"Kenneth.enevoldsen@cas.au.dk",
8
  "url":"https://chcaa.io/#/",
9
- "license":"Apache-2.0 License",
10
- "spacy_version":">=3.1.1,<3.2.0",
11
- "spacy_git_version":"ffaead8fe",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
@@ -18,6 +18,25 @@
18
  "labels":{
19
  "transformer":[
20
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
21
  ],
22
  "morphologizer":[
23
  "AdpType=Prep|POS=ADP",
@@ -34,155 +53,157 @@
34
  "Degree=Pos|Number=Plur|POS=ADJ",
35
  "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN",
36
  "POS=PUNCT",
 
37
  "POS=CCONJ",
38
- "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ",
39
- "Degree=Cmp|POS=ADJ",
40
- "POS=PRON|PartType=Inf",
41
- "Gender=Com|Number=Sing|POS=DET|PronType=Ind",
42
- "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ",
43
- "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs",
44
  "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN",
45
- "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ",
46
- "Gender=Neut|Number=Sing|POS=DET|PronType=Dem",
 
47
  "Degree=Pos|POS=ADV",
48
- "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
49
- "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN",
50
- "POS=PRON|PronType=Dem",
51
- "NumType=Card|POS=NUM",
52
- "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ",
53
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs",
54
- "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ",
55
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs",
56
- "NumType=Ord|POS=ADJ",
57
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
58
  "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act",
59
- "POS=VERB|VerbForm=Inf|Voice=Act",
 
60
  "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act",
61
- "POS=NOUN",
62
- "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass",
63
  "POS=ADP|PartType=Inf",
 
 
64
  "Degree=Pos|POS=ADJ",
 
 
 
65
  "Definite=Def|Gender=Com|Number=Plur|POS=NOUN",
66
- "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs",
 
 
 
67
  "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN",
 
 
68
  "POS=AUX|VerbForm=Inf|Voice=Act",
69
- "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ",
 
 
 
 
 
 
 
70
  "Gender=Com|Number=Sing|POS=DET|PronType=Dem",
71
- "Number=Plur|POS=DET|PronType=Ind",
72
- "Gender=Com|Number=Sing|POS=PRON|PronType=Ind",
73
- "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes",
74
- "POS=PART|PartType=Inf",
 
 
 
 
 
 
 
 
 
75
  "Gender=Neut|Number=Sing|POS=DET|PronType=Ind",
76
- "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs",
77
- "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN",
78
- "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs",
 
 
 
 
 
 
79
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs",
80
- "Case=Nom|Gender=Com|POS=PRON|PronType=Ind",
81
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind",
82
- "Mood=Imp|POS=VERB",
83
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
84
- "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part",
85
- "POS=X",
 
 
 
 
 
 
 
86
  "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs",
 
 
 
 
 
 
 
 
 
 
87
  "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN",
88
- "POS=VERB|Tense=Pres|VerbForm=Part",
89
- "Number=Plur|POS=PRON|PronType=Int,Rel",
90
- "POS=VERB|VerbForm=Inf|Voice=Pass",
91
- "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN",
92
- "Degree=Cmp|POS=ADV",
93
- "POS=ADV|PartType=Inf",
94
- "Degree=Sup|POS=ADV",
95
  "Number=Plur|POS=PRON|PronType=Dem",
96
- "Number=Plur|POS=PRON|PronType=Ind",
97
- "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN",
98
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs",
99
- "Case=Gen|POS=PROPN",
100
- "POS=ADP",
101
  "Degree=Cmp|Number=Plur|POS=ADJ",
102
- "Definite=Def|Degree=Sup|POS=ADJ",
103
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
104
- "Degree=Pos|Number=Sing|POS=ADJ",
105
- "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
106
  "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
107
  "Number=Plur|POS=PRON|PronType=Rcp",
 
108
  "Case=Gen|Degree=Cmp|POS=ADJ",
109
  "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN",
110
- "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs",
111
- "POS=INTJ",
112
- "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
113
- "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ",
114
- "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
115
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs",
116
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
117
- "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN",
118
- "Number=Sing|POS=PRON|PronType=Int,Rel",
119
  "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
120
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel",
121
- "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ",
122
- "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs",
123
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
124
- "Definite=Ind|Number=Sing|POS=NOUN",
125
- "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
126
  "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
127
- "POS=SYM",
128
- "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs",
129
- "Degree=Sup|POS=ADJ",
130
- "Number=Plur|POS=DET|PronType=Ind|Style=Arch",
131
- "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Dem",
132
- "Foreign=Yes|POS=X",
133
  "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs",
134
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem",
135
- "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs",
136
- "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN",
137
- "Case=Gen|POS=PRON|PronType=Int,Rel",
138
- "Gender=Com|Number=Sing|POS=PRON|PronType=Dem",
139
- "Abbr=Yes|POS=X",
140
- "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN",
141
  "Definite=Def|Degree=Abs|POS=ADJ",
142
- "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ",
143
- "Definite=Ind|POS=NOUN",
144
- "Gender=Com|Number=Plur|POS=NOUN",
145
- "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs",
146
- "Gender=Com|POS=PRON|PronType=Int,Rel",
147
- "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs",
148
  "Degree=Abs|POS=ADV",
149
- "POS=VERB|VerbForm=Ger",
150
- "POS=VERB|Tense=Past|VerbForm=Part",
151
- "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ",
152
- "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form",
153
  "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ",
154
- "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ",
155
- "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs",
156
  "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel",
157
- "POS=VERB|Tense=Pres",
158
- "Case=Gen|Number=Plur|POS=DET|PronType=Ind",
159
- "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs",
160
- "POS=PRON|Person=2|Polite=Form|Poss=Yes|PronType=Prs",
161
  "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
162
- "POS=AUX|Tense=Pres|VerbForm=Part",
163
- "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass",
164
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
165
- "Degree=Sup|Number=Plur|POS=ADJ",
166
- "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs",
167
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
168
- "Definite=Ind|Number=Plur|POS=NOUN",
169
- "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
170
- "Mood=Imp|POS=AUX",
171
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs",
172
- "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
173
- "Definite=Def|Gender=Com|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
174
  "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
 
 
175
  "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind",
 
 
176
  "Case=Gen|POS=NOUN",
177
- "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
178
- "POS=DET|PronType=Dem",
179
- "Definite=Def|Number=Plur|POS=NOUN"
180
  ],
181
  "parser":[
182
  "ROOT",
183
  "acl:relcl",
184
  "advcl",
185
  "advmod",
 
186
  "amod",
187
  "appos",
188
  "aux",
@@ -206,376 +227,459 @@
206
  "nummod",
207
  "obj",
208
  "obl",
209
- "obl:loc",
210
  "obl:tmod",
211
  "punct",
212
  "xcomp"
213
- ],
214
- "attribute_ruler":[
215
-
216
- ],
217
- "lemmatizer":[
218
-
219
  ],
220
  "ner":[
221
  "LOC",
222
  "MISC",
223
  "ORG",
224
  "PER"
 
 
 
 
 
 
 
 
 
225
  ]
226
  },
227
  "pipeline":[
228
  "transformer",
 
229
  "morphologizer",
 
230
  "parser",
231
- "attribute_ruler",
232
- "lemmatizer",
233
- "ner"
 
 
234
  ],
235
  "components":[
236
  "transformer",
 
237
  "morphologizer",
 
238
  "parser",
239
- "attribute_ruler",
240
- "lemmatizer",
241
- "ner"
 
 
242
  ],
243
  "disabled":[
244
 
245
  ],
246
- "_sourced_vectors_hashes":{
247
-
248
- },
 
249
  "performance":{
250
- "pos_acc":0.9583030655,
251
- "morph_acc":0.9570439246,
 
 
 
 
 
 
 
 
 
 
 
252
  "morph_per_feat":{
253
- "Mood":{
254
- "p":0.9950690335,
255
- "r":0.9618684461,
256
- "f":0.9781871062
257
- },
258
- "Tense":{
259
- "p":0.9859922179,
260
- "r":0.9540662651,
261
- "f":0.9697665519
262
  },
263
- "VerbForm":{
264
- "p":0.9823343849,
265
- "r":0.952876377,
266
- "f":0.9673811743
267
  },
268
- "Voice":{
269
- "p":0.9938414165,
270
- "r":0.9648729447,
271
- "f":0.9791429655
272
  },
273
  "Definite":{
274
- "p":0.9872480461,
275
- "r":0.9482418017,
276
- "f":0.9673518742
277
  },
278
  "Gender":{
279
- "p":0.9793956044,
280
- "r":0.9478231971,
281
- "f":0.9633507853
282
  },
283
- "Number":{
284
- "p":0.985179197,
285
- "r":0.9535732916,
286
- "f":0.9691186216
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
287
  },
288
  "AdpType":{
289
  "p":1.0,
290
- "r":0.9752431477,
291
- "f":0.9874664279
292
  },
293
- "PartType":{
294
- "p":1.0,
295
- "r":0.9675324675,
296
- "f":0.9834983498
297
  },
298
  "Case":{
299
- "p":0.9934640523,
300
- "r":0.9605055292,
301
- "f":0.9767068273
302
  },
303
  "Person":{
304
- "p":0.9908925319,
305
- "r":0.9662522202,
306
- "f":0.9784172662
307
  },
308
- "PronType":{
309
- "p":0.9941077441,
310
- "r":0.9712171053,
311
- "f":0.9825291181
312
- },
313
- "NumType":{
314
- "p":0.9791666667,
315
- "r":0.9337748344,
316
- "f":0.9559322034
317
  },
318
- "Degree":{
319
- "p":0.9726708075,
320
- "r":0.943373494,
321
- "f":0.9577981651
322
  },
323
- "Reflex":{
324
  "p":1.0,
325
  "r":1.0,
326
  "f":1.0
327
  },
328
- "Number[psor]":{
329
- "p":1.0,
330
- "r":0.988372093,
331
- "f":0.9941520468
332
- },
333
- "Poss":{
334
  "p":1.0,
335
- "r":0.9772727273,
336
- "f":0.9885057471
337
  },
338
  "Foreign":{
339
- "p":0.8888888889,
340
- "r":0.8,
341
- "f":0.8421052632
342
  },
343
  "Abbr":{
344
- "p":1.0,
345
- "r":0.4,
346
- "f":0.5714285714
347
  },
348
  "Style":{
349
  "p":1.0,
350
- "r":1.0,
351
- "f":1.0
352
  },
353
  "Polite":{
354
- "p":0.3333333333,
355
- "r":0.25,
356
- "f":0.2857142857
357
  }
358
  },
359
- "dep_uas":0.8492442546,
360
- "dep_las":0.8176199573,
361
  "dep_las_per_type":{
362
- "advmod":{
363
- "p":0.7724637681,
364
- "r":0.7528248588,
365
- "f":0.7625178827
366
  },
367
- "root":{
368
- "p":0.8561403509,
369
- "r":0.865248227,
370
- "f":0.860670194
371
  },
372
- "nsubj":{
373
- "p":0.8939393939,
374
- "r":0.8713080169,
375
- "f":0.8824786325
376
  },
377
- "case":{
378
- "p":0.9141414141,
379
- "r":0.8942687747,
380
- "f":0.9040959041
381
  },
382
- "obl":{
383
- "p":0.7286585366,
384
- "r":0.7433903577,
385
- "f":0.7359507313
386
  },
387
  "cc":{
388
- "p":0.8486646884,
389
- "r":0.8313953488,
390
- "f":0.8399412628
391
  },
392
  "conj":{
393
- "p":0.671957672,
394
- "r":0.6773333333,
395
- "f":0.6746347942
396
  },
397
- "obj":{
398
- "p":0.8560747664,
399
- "r":0.8893203883,
400
- "f":0.8723809524
401
  },
402
- "aux":{
403
- "p":0.8885542169,
404
- "r":0.860058309,
405
- "f":0.8740740741
406
  },
407
- "acl:relcl":{
408
- "p":0.6936416185,
409
- "r":0.6486486486,
410
- "f":0.6703910615
411
  },
412
- "obl:loc":{
413
- "p":0.7222222222,
414
- "r":0.7428571429,
415
- "f":0.7323943662
416
  },
417
- "det":{
418
- "p":0.9346733668,
419
- "r":0.9192751236,
420
- "f":0.926910299
421
  },
422
- "amod":{
423
- "p":0.8549488055,
424
- "r":0.8549488055,
425
- "f":0.8549488055
426
  },
427
- "nmod:poss":{
428
- "p":0.75,
429
- "r":0.7128712871,
430
- "f":0.730964467
431
  },
432
- "ccomp":{
433
- "p":0.6885245902,
434
- "r":0.6774193548,
435
- "f":0.6829268293
436
  },
437
- "nummod":{
438
- "p":0.8181818182,
439
- "r":0.825,
440
- "f":0.8215767635
441
  },
442
- "flat":{
443
- "p":0.8636363636,
444
- "r":0.880794702,
445
- "f":0.8721311475
446
  },
447
- "compound:prt":{
448
- "p":0.6551724138,
449
- "r":0.4634146341,
450
- "f":0.5428571429
451
  },
452
  "advcl":{
453
- "p":0.6967213115,
454
- "r":0.7327586207,
455
- "f":0.7142857143
456
- },
457
- "mark":{
458
- "p":0.9018789144,
459
- "r":0.887063655,
460
- "f":0.8944099379
461
  },
462
  "cop":{
463
- "p":0.8514285714,
464
- "r":0.8514285714,
465
- "f":0.8514285714
466
  },
467
  "dep":{
468
- "p":0.1960784314,
469
- "r":0.3773584906,
470
- "f":0.2580645161
471
  },
472
- "nmod":{
473
- "p":0.7197452229,
474
- "r":0.662109375,
475
- "f":0.6897253306
476
  },
477
  "iobj":{
478
- "p":0.7333333333,
479
- "r":0.5,
480
- "f":0.5945945946
 
 
 
 
 
481
  },
482
  "xcomp":{
483
- "p":0.6315789474,
484
- "r":0.406779661,
485
- "f":0.4948453608
 
 
 
 
 
 
 
 
 
 
486
  },
487
  "list":{
488
- "p":0.3636363636,
489
- "r":0.2222222222,
490
- "f":0.275862069
491
  },
492
- "vocative":{
 
 
 
 
 
493
  "p":0.0,
494
  "r":0.0,
495
  "f":0.0
496
  },
497
- "fixed":{
498
- "p":0.8947368421,
499
- "r":0.8095238095,
500
- "f":0.85
501
  },
502
- "expl":{
503
- "p":0.9090909091,
504
- "r":0.8823529412,
505
- "f":0.8955223881
506
  },
507
- "appos":{
508
- "p":0.6097560976,
509
- "r":0.7575757576,
510
- "f":0.6756756757
511
  },
512
- "obl:tmod":{
513
- "p":0.8,
514
- "r":0.2222222222,
515
- "f":0.347826087
516
  },
517
- "discourse":{
518
  "p":0.0,
519
  "r":0.0,
520
  "f":0.0
521
  }
522
  },
523
- "sents_p":0.8603839442,
524
- "sents_r":0.8741134752,
525
- "sents_f":0.8671943712,
526
- "lemma_acc":0.8491041162,
527
- "ents_f":0.8231644261,
528
- "ents_p":0.81724846,
529
- "ents_r":0.8291666667,
530
  "ents_per_type":{
531
- "PER":{
532
- "p":0.9290322581,
533
- "r":0.8674698795,
534
- "f":0.8971962617
535
  },
536
  "ORG":{
537
- "p":0.7619047619,
538
- "r":0.7111111111,
539
- "f":0.7356321839
 
 
 
 
 
540
  },
541
  "MISC":{
542
- "p":0.6739130435,
543
- "r":0.8230088496,
544
- "f":0.7410358566
545
  },
546
  "LOC":{
547
- "p":0.8818181818,
548
- "r":0.8738738739,
549
- "f":0.8778280543
 
 
 
 
 
550
  }
551
- },
552
- "transformer_loss":417466.8663170633,
553
- "morphologizer_loss":34589.6649030063,
554
- "parser_loss":151048.9837691551,
555
- "ner_loss":5460.9844742843
556
  },
557
  "sources":[
558
  {
559
- "name":"UD Danish DDT v2.5",
560
  "url":"https://github.com/UniversalDependencies/UD_Danish-DDT",
561
  "license":"CC BY-SA 4.0",
562
  "author":"Johannsen, Anders; Mart\u00ednez Alonso, H\u00e9ctor; Plank, Barbara"
563
  },
564
  {
565
  "name":"DaNE",
566
- "url":"https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane",
567
  "license":"CC BY-SA 4.0",
568
  "author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
569
  },
570
  {
571
- "name":"Maltehb/-l-ctra-danish-electra-small-cased",
572
- "author":"Malte H\u00f8jmark-Bertelsen",
573
- "url":"https://huggingface.co/Maltehb/-l-ctra-danish-electra-small-cased",
574
  "license":"CC BY 4.0"
575
  }
576
  ],
577
- "requirements":[
578
- "spacy-transformers>=1.0.3,<1.1.0"
579
- ],
580
- "notes":"\n## Bias and Robustness\n\nBesides the validation done by spaCy on the DaNE test set, DaCy also provides a series of augmentations to the DaNE test set to see how well the models deal with these types of augmentations.\nThese can be seen as behavioural probes akin to the NLP checklist.\n\n### Deterministic Augmentations\nDeterministic augmentations are augmentations which always yield the same result.\n\n| Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) | Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |\n| --- | --- | --- | --- | --- | --- | --- | --- |\n| No augmentation | 0.98 | 0.974 | 0.868 | 0.836 | 0.936 | 0.844 | 0.765 |\n| \u00c6\u00f8\u00e5 Augmentation | 0.955 | 0.948 | 0.823 | 0.783 | 0.922 | 0.754 | 0.718 |\n| Lowercase | 0.974 | 0.97 | 0.862 | 0.828 | 0.905 | 0.848 | 0.681 |\n| No Spacing | 0.229 | 0.229 | 0.004 | 0.003 | 0.824 | 0.225 | 0.048 |\n| Abbreviated first names | 0.979 | 0.973 | 0.864 | 0.832 | 0.94 | 0.845 | 0.699 |\n| Input size augmentation 5 sentences | 0.956 | 0.956 | 0.851 | 0.818 | 0.883 | 0.844 | 0.743 |\n| Input size augmentation 10 sentences | 0.959 | 0.958 | 0.853 | 0.821 | 0.897 | 0.844 | 0.755 |\n\n\n\n### Stochastic Augmentations\nStochastic augmentations are augmentations which are repeated multiple times to estimate the effect of the augmentation.\n\n| Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) | Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |\n| --- | --- | --- | --- | --- | --- | --- | --- |\n| Keystroke errors 2% | 0.931 (0.003) | 0.929 (0.003) | 0.797 (0.003) | 0.753 (0.003) | 0.884 (0.003) | 0.772 (0.003) | 0.657 (0.003) |\n| Keystroke errors 5% | 0.859 (0.003) | 0.863 (0.003) | 0.699 (0.003) | 0.641 (0.003) | 0.824 (0.003) | 0.681 (0.003) | 
0.53 (0.003) |\n| Keystroke errors 15% | 0.633 (0.006) | 0.662 (0.006) | 0.439 (0.006) | 0.358 (0.006) | 0.688 (0.006) | 0.459 (0.006) | 0.293 (0.006) |\n| Danish names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.943 (0.0) | 0.847 (0.0) | 0.748 (0.0) |\n| Muslim names | 0.979 (0.0) | 0.974 (0.0) | 0.865 (0.0) | 0.833 (0.0) | 0.94 (0.0) | 0.847 (0.0) | 0.732 (0.0) |\n| Female names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.946 (0.0) | 0.847 (0.0) | 0.754 (0.0) |\n| Male names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.943 (0.0) | 0.847 (0.0) | 0.748 (0.0) |\n| Spacing Augmentation 5% | 0.941 (0.002) | 0.936 (0.002) | 0.755 (0.002) | 0.725 (0.002) | 0.907 (0.002) | 0.811 (0.002) | 0.699 (0.002) |\n\n<details>\n\n<summary> Description of Augmenters </summary>\n\n \n\n**No augmentation:**\nApplies no augmentation to the DaNE test set.\n\n**\u00c6\u00f8\u00e5 Augmentation:**\nThis augmentation replaces \u00e6, \u00f8, and \u00e5 with their spelling variations ae, oe, and aa, respectively.\n\n**Lowercase:**\nThis augmentation lowercases all text.\n\n**No Spacing:**\nThis augmentation removes all spacing from the text.\n\n**Abbreviated first names:**\nThis augmentation abbreviates the first names of entities. For instance, 'Kenneth Enevoldsen' would turn into 'K. Enevoldsen'.\n\n**Keystroke errors 2%:**\nThis augmentation simulates keystroke errors by replacing 2% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Keystroke errors 5%:**\nThis augmentation simulates keystroke errors by replacing 5% of keys with a neighbouring key on a Danish QWERTY keyboard. 
As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Keystroke errors 15%:**\nThis augmentation simulates keystroke errors by replacing 15% of keys with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Danish names:**\nThis augmentation replaces all names with Danish names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Muslim names:**\nThis augmentation replaces all names with Muslim names derived from Meldgaard (2005). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Female names:**\nThis augmentation replaces all names with Danish female names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Male names:**\nThis augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Spacing Augmentation 5%:**\nThis augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n </details> \n <br /> \n\n\n### Hardware\nThis was run and trained on a Quadro RTX 8000 GPU."
581
  }
 
1
  {
2
  "lang":"da",
3
  "name":"dacy_small_trf",
4
+ "version":"0.2.0",
5
+ "description":"\n<a href=\"https://github.com/centre-for-humanities-computing/Dacy\"><img src=\"https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png\" width=\"175\" height=\"175\" align=\"right\" /></a>\n\n# DaCy small\n\nDaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.\nDaCy's largest pipeline has achieved state-of-the-art performance on part-of-speech tagging and dependency \nparsing for Danish on the Danish Dependency Treebank as well as competitive performance on named entity recognition, named entity disambiguation and coreference resolution. \nTo read more, check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results. \nDaCy also contains guides on usage of the package as well as behavioural tests for biases and robustness of Danish NLP pipelines.\n",
6
+ "author":"Kenneth Enevoldsen",
7
  "email":"Kenneth.enevoldsen@cas.au.dk",
8
  "url":"https://chcaa.io/#/",
9
+ "license":"Apache-2.0",
10
+ "spacy_version":">=3.5.2,<3.6.0",
11
+ "spacy_git_version":"Unknown",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
 
18
  "labels":{
19
  "transformer":[
20
 
21
+ ],
22
+ "tagger":[
23
+ "ADJ",
24
+ "ADP",
25
+ "ADV",
26
+ "AUX",
27
+ "CCONJ",
28
+ "DET",
29
+ "INTJ",
30
+ "NOUN",
31
+ "NUM",
32
+ "PART",
33
+ "PRON",
34
+ "PROPN",
35
+ "PUNCT",
36
+ "SCONJ",
37
+ "SYM",
38
+ "VERB",
39
+ "X"
40
  ],
41
  "morphologizer":[
42
  "AdpType=Prep|POS=ADP",
 
53
  "Degree=Pos|Number=Plur|POS=ADJ",
54
  "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN",
55
  "POS=PUNCT",
56
+ "NumType=Ord|POS=ADJ",
57
  "POS=CCONJ",
 
 
 
 
 
 
58
  "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN",
59
+ "POS=VERB|VerbForm=Inf|Voice=Act",
60
+ "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs",
61
+ "Degree=Sup|POS=ADV",
62
  "Degree=Pos|POS=ADV",
63
+ "Gender=Com|Number=Sing|POS=DET|PronType=Ind",
64
+ "Number=Plur|POS=DET|PronType=Ind",
65
+ "POS=ADP",
66
+ "POS=ADV|PartType=Inf",
 
 
 
67
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs",
 
 
68
  "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act",
69
+ "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ",
70
+ "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs",
71
  "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act",
 
 
72
  "POS=ADP|PartType=Inf",
73
+ "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ",
74
+ "NumType=Card|POS=NUM",
75
  "Degree=Pos|POS=ADJ",
76
+ "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part",
77
+ "POS=PART|PartType=Inf",
78
+ "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes",
79
  "Definite=Def|Gender=Com|Number=Plur|POS=NOUN",
80
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN",
81
+ "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs",
82
+ "POS=VERB|Tense=Pres|VerbForm=Part",
83
+ "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs",
84
  "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN",
85
+ "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ",
86
+ "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs",
87
  "POS=AUX|VerbForm=Inf|Voice=Act",
88
+ "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ",
89
+ "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ",
90
+ "Degree=Cmp|POS=ADJ",
91
+ "POS=PRON|PartType=Inf",
92
+ "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ",
93
+ "Case=Nom|Gender=Com|POS=PRON|PronType=Ind",
94
+ "Number=Plur|POS=PRON|PronType=Ind",
95
+ "POS=INTJ",
96
  "Gender=Com|Number=Sing|POS=DET|PronType=Dem",
97
+ "Case=Gen|Number=Plur|POS=DET|PronType=Ind",
98
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass",
99
+ "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN",
100
+ "Degree=Cmp|POS=ADV",
101
+ "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form",
102
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs",
103
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
104
+ "Case=Gen|POS=PROPN",
105
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind",
106
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
107
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
108
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs",
109
+ "Definite=Def|Degree=Sup|POS=ADJ",
110
  "Gender=Neut|Number=Sing|POS=DET|PronType=Ind",
111
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN",
112
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Dem",
113
+ "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
114
+ "POS=PRON|PronType=Dem",
115
+ "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ",
116
+ "Number=Plur|POS=NUM",
117
+ "POS=VERB|VerbForm=Inf|Voice=Pass",
118
+ "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ",
119
+ "Number=Sing|POS=PRON|PronType=Int,Rel",
120
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs",
121
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
 
 
122
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
123
+ "POS=PRON",
124
+ "Definite=Ind|Number=Sing|POS=NOUN",
125
+ "Definite=Ind|Number=Sing|POS=NUM",
126
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN",
127
+ "Foreign=Yes|POS=ADV",
128
+ "POS=NOUN",
129
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN",
130
+ "Gender=Com|Number=Plur|POS=NOUN",
131
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel",
132
  "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs",
133
+ "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs",
134
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Ind",
135
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN",
136
+ "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ",
137
+ "Degree=Sup|POS=ADJ",
138
+ "Degree=Pos|Number=Sing|POS=ADJ",
139
+ "Mood=Imp|POS=VERB",
140
+ "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs",
141
+ "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs",
142
+ "POS=X",
143
  "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN",
 
 
 
 
 
 
 
144
  "Number=Plur|POS=PRON|PronType=Dem",
145
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs",
146
+ "Number=Plur|POS=PRON|PronType=Int,Rel",
147
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
 
 
148
  "Degree=Cmp|Number=Plur|POS=ADJ",
149
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs",
 
 
 
150
  "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
151
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs",
152
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs",
153
+ "Gender=Com|POS=PRON|PronType=Int,Rel",
154
+ "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ",
155
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
156
+ "POS=VERB|VerbForm=Ger",
157
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Dem",
158
+ "Case=Gen|POS=PRON|PronType=Int,Rel",
159
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass",
160
+ "Abbr=Yes|POS=X",
161
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN",
162
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
163
+ "Definite=Ind|Number=Plur|POS=NOUN",
164
+ "Foreign=Yes|POS=X",
165
  "Number=Plur|POS=PRON|PronType=Rcp",
166
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs",
167
  "Case=Gen|Degree=Cmp|POS=ADJ",
168
  "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN",
169
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs",
170
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem",
 
 
 
 
 
 
 
171
  "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
172
+ "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form",
 
 
 
 
 
173
  "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
174
+ "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
175
+ "Case=Gen|Number=Plur|POS=PRON|PronType=Rcp",
 
 
 
 
176
  "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs",
177
+ "POS=SYM",
178
+ "POS=DET|PronType=Dem",
179
+ "Gender=Com|Number=Sing|POS=NUM",
180
+ "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs",
181
+ "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
 
 
182
  "Definite=Def|Degree=Abs|POS=ADJ",
183
+ "POS=VERB|Tense=Pres",
184
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NUM",
 
 
 
 
185
  "Degree=Abs|POS=ADV",
 
 
 
 
186
  "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ",
 
 
187
  "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel",
188
+ "POS=VERB|Tense=Past|VerbForm=Part",
189
+ "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ",
 
 
190
  "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
 
 
 
 
 
 
 
 
 
191
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs",
 
 
192
  "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs",
193
+ "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
194
+ "Definite=Ind|POS=NOUN",
195
  "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind",
196
+ "Definite=Ind|Gender=Com|Number=Sing|POS=NUM",
197
+ "Definite=Def|Number=Plur|POS=NOUN",
198
  "Case=Gen|POS=NOUN",
199
+ "POS=AUX|Tense=Pres|VerbForm=Part"
 
 
200
  ],
201
  "parser":[
202
  "ROOT",
203
  "acl:relcl",
204
  "advcl",
205
  "advmod",
206
+ "advmod:lmod",
207
  "amod",
208
  "appos",
209
  "aux",
 
227
  "nummod",
228
  "obj",
229
  "obl",
230
+ "obl:lmod",
231
  "obl:tmod",
232
  "punct",
233
  "xcomp"
 
 
 
 
 
 
234
  ],
235
  "ner":[
236
  "LOC",
237
  "MISC",
238
  "ORG",
239
  "PER"
240
+ ],
241
+ "coref":[
242
+
243
+ ],
244
+ "span_resolver":[
245
+
246
+ ],
247
+ "entity_linker":[
248
+
249
  ]
250
  },
251
  "pipeline":[
252
  "transformer",
253
+ "tagger",
254
  "morphologizer",
255
+ "trainable_lemmatizer",
256
  "parser",
257
+ "ner",
258
+ "coref",
259
+ "span_resolver",
260
+ "span_cleaner",
261
+ "entity_linker"
262
  ],
263
  "components":[
264
  "transformer",
265
+ "tagger",
266
  "morphologizer",
267
+ "trainable_lemmatizer",
268
  "parser",
269
+ "ner",
270
+ "coref",
271
+ "span_resolver",
272
+ "span_cleaner",
273
+ "entity_linker"
274
  ],
275
  "disabled":[
276
 
277
  ],
278
+ "requirements":[
279
+ "spacy-experimental>=0.6.2,<0.7.0",
280
+ "spacy-transformers>=1.2.3,<1.3.0"
281
+ ],
282
  "performance":{
283
+ "token_acc":0.9992023928,
284
+ "token_p":0.9970089731,
285
+ "token_r":0.9977052779,
286
+ "token_f":0.9973570039,
287
+ "sents_p":0.9295532646,
288
+ "sents_r":0.9575221239,
289
+ "sents_f":0.9433304272,
290
+ "tag_acc":0.9846798742,
291
+ "pos_acc":0.9842315369,
292
+ "morph_acc":0.9772942762,
293
+ "morph_micro_p":0.9894326733,
294
+ "morph_micro_r":0.9833448258,
295
+ "morph_micro_f":0.9863793562,
296
  "morph_per_feat":{
297
+ "NumType":{
298
+ "p":0.9941176471,
299
+ "r":0.9825581395,
300
+ "f":0.9883040936
 
 
 
 
 
301
  },
302
+ "Degree":{
303
+ "p":0.9791666667,
304
+ "r":0.9715762274,
305
+ "f":0.9753566796
306
  },
307
+ "Number":{
308
+ "p":0.9824362606,
309
+ "r":0.9774520857,
310
+ "f":0.9799378355
311
  },
312
  "Definite":{
313
+ "p":0.9870410367,
314
+ "r":0.9777492512,
315
+ "f":0.9823731728
316
  },
317
  "Gender":{
318
+ "p":0.9781150724,
319
+ "r":0.9712583246,
320
+ "f":0.9746746395
321
  },
322
+ "Mood":{
323
+ "p":0.9990366089,
324
+ "r":0.9952015355,
325
+ "f":0.9971153846
326
+ },
327
+ "Tense":{
328
+ "p":0.9960784314,
329
+ "r":0.9898674981,
330
+ "f":0.9929632525
331
+ },
332
+ "VerbForm":{
333
+ "p":0.9968454259,
334
+ "r":0.991217064,
335
+ "f":0.9940232778
336
+ },
337
+ "Voice":{
338
+ "p":0.999251497,
339
+ "r":0.9955257271,
340
+ "f":0.9973851326
341
  },
342
  "AdpType":{
343
  "p":1.0,
344
+ "r":0.9953531599,
345
+ "f":0.9976711691
346
  },
347
+ "PronType":{
348
+ "p":0.9936708861,
349
+ "r":0.9918772563,
350
+ "f":0.9927732611
351
  },
352
  "Case":{
353
+ "p":0.9984350548,
354
+ "r":0.9891472868,
355
+ "f":0.9937694704
356
  },
357
  "Person":{
358
+ "p":0.9965095986,
359
+ "r":0.9896013865,
360
+ "f":0.9930434783
361
  },
362
+ "Number[psor]":{
363
+ "p":0.9875,
364
+ "r":0.975308642,
365
+ "f":0.9813664596
 
 
 
 
 
366
  },
367
+ "Poss":{
368
+ "p":1.0,
369
+ "r":0.987654321,
370
+ "f":0.9937888199
371
  },
372
+ "PartType":{
373
  "p":1.0,
374
  "r":1.0,
375
  "f":1.0
376
  },
377
+ "Reflex":{
 
 
 
 
 
378
  "p":1.0,
379
+ "r":1.0,
380
+ "f":1.0
381
  },
382
  "Foreign":{
383
+ "p":0.5,
384
+ "r":0.4,
385
+ "f":0.4444444444
386
  },
387
  "Abbr":{
388
+ "p":0.3333333333,
389
+ "r":0.5,
390
+ "f":0.4
391
  },
392
  "Style":{
393
  "p":1.0,
394
+ "r":0.5,
395
+ "f":0.6666666667
396
  },
397
  "Polite":{
398
+ "p":1.0,
399
+ "r":0.6666666667,
400
+ "f":0.8
401
  }
402
  },
403
+ "dep_uas":0.8978522787,
404
+ "dep_las":0.8701623698,
405
  "dep_las_per_type":{
406
+ "nummod":{
407
+ "p":0.8070175439,
408
+ "r":0.814159292,
409
+ "f":0.8105726872
410
  },
411
+ "amod":{
412
+ "p":0.8970588235,
413
+ "r":0.895412844,
414
+ "f":0.8962350781
415
  },
416
+ "nmod":{
417
+ "p":0.7772727273,
418
+ "r":0.7467248908,
419
+ "f":0.7616926503
420
  },
421
+ "nsubj":{
422
+ "p":0.9386243386,
423
+ "r":0.9386243386,
424
+ "f":0.9386243386
425
  },
426
+ "flat":{
427
+ "p":0.9319371728,
428
+ "r":0.9468085106,
429
+ "f":0.9393139842
430
  },
431
  "cc":{
432
+ "p":0.8813559322,
433
+ "r":0.8609271523,
434
+ "f":0.8710217755
435
  },
436
  "conj":{
437
+ "p":0.8392857143,
438
+ "r":0.8150289017,
439
+ "f":0.8269794721
440
  },
441
+ "root":{
442
+ "p":0.8807495741,
443
+ "r":0.9150442478,
444
+ "f":0.8975694444
445
  },
446
+ "advmod":{
447
+ "p":0.8590704648,
448
+ "r":0.8590704648,
449
+ "f":0.8590704648
450
  },
451
+ "mark":{
452
+ "p":0.9280898876,
453
+ "r":0.9198218263,
454
+ "f":0.9239373602
455
  },
456
+ "aux":{
457
+ "p":0.9813084112,
458
+ "r":0.9692307692,
459
+ "f":0.9752321981
460
  },
461
+ "ccomp":{
462
+ "p":0.7411764706,
463
+ "r":0.7974683544,
464
+ "f":0.7682926829
465
  },
466
+ "case":{
467
+ "p":0.9367631297,
468
+ "r":0.9171038825,
469
+ "f":0.9268292683
470
  },
471
+ "det":{
472
+ "p":0.9388560158,
473
+ "r":0.9596774194,
474
+ "f":0.9491525424
475
  },
476
+ "obl":{
477
+ "p":0.8076923077,
478
+ "r":0.7987321712,
479
+ "f":0.803187251
480
  },
481
+ "appos":{
482
+ "p":0.7352941176,
483
+ "r":0.6578947368,
484
+ "f":0.6944444444
485
  },
486
+ "nmod:poss":{
487
+ "p":0.8113207547,
488
+ "r":0.7889908257,
489
+ "f":0.8
490
  },
491
+ "obj":{
492
+ "p":0.8905380334,
493
+ "r":0.9142857143,
494
+ "f":0.9022556391
495
  },
496
  "advcl":{
497
+ "p":0.7763157895,
498
+ "r":0.7564102564,
499
+ "f":0.7662337662
 
 
 
 
 
500
  },
501
  "cop":{
502
+ "p":0.875,
503
+ "r":0.8588957055,
504
+ "f":0.866873065
505
+ },
506
+ "acl:relcl":{
507
+ "p":0.7666666667,
508
+ "r":0.7540983607,
509
+ "f":0.7603305785
510
+ },
511
+ "compound:prt":{
512
+ "p":0.5,
513
+ "r":0.6176470588,
514
+ "f":0.5526315789
515
  },
516
  "dep":{
517
+ "p":0.0892857143,
518
+ "r":0.3333333333,
519
+ "f":0.1408450704
520
  },
521
+ "fixed":{
522
+ "p":0.9310344828,
523
+ "r":0.8709677419,
524
+ "f":0.9
525
  },
526
  "iobj":{
527
+ "p":0.7857142857,
528
+ "r":0.7333333333,
529
+ "f":0.7586206897
530
+ },
531
+ "obl:tmod":{
532
+ "p":0.4285714286,
533
+ "r":0.1875,
534
+ "f":0.2608695652
535
  },
536
  "xcomp":{
537
+ "p":0.7894736842,
538
+ "r":0.703125,
539
+ "f":0.7438016529
540
+ },
541
+ "advmod:lmod":{
542
+ "p":0.9111111111,
543
+ "r":0.8541666667,
544
+ "f":0.8817204301
545
+ },
546
+ "expl":{
547
+ "p":0.9,
548
+ "r":0.9230769231,
549
+ "f":0.9113924051
550
  },
551
  "list":{
552
+ "p":0.3333333333,
553
+ "r":0.1764705882,
554
+ "f":0.2307692308
555
  },
556
+ "obl:lmod":{
557
+ "p":1.0,
558
+ "r":0.3333333333,
559
+ "f":0.5
560
+ },
561
+ "parataxis":{
562
  "p":0.0,
563
  "r":0.0,
564
  "f":0.0
565
  },
566
+ "orphan":{
567
+ "p":0.0,
568
+ "r":0.0,
569
+ "f":0.0
570
  },
571
+ "vocative":{
572
+ "p":0.0,
573
+ "r":0.0,
574
+ "f":0.0
575
  },
576
+ "discourse":{
577
+ "p":0.0,
578
+ "r":0.0,
579
+ "f":0.0
580
  },
581
+ "dislocated":{
582
+ "p":0.0,
583
+ "r":0.0,
584
+ "f":0.0
585
  },
586
+ "compound":{
587
  "p":0.0,
588
  "r":0.0,
589
  "f":0.0
590
  }
591
  },
592
+ "ents_p":0.8306010929,
593
+ "ents_r":0.8172043011,
594
+ "ents_f":0.8238482385,
 
 
 
 
595
  "ents_per_type":{
596
+ "LOC":{
597
+ "p":0.8,
598
+ "r":0.875,
599
+ "f":0.8358208955
600
  },
601
  "ORG":{
602
+ "p":0.8,
603
+ "r":0.7204968944,
604
+ "f":0.7581699346
605
+ },
606
+ "PER":{
607
+ "p":0.9060773481,
608
+ "r":0.9111111111,
609
+ "f":0.9085872576
610
  },
611
  "MISC":{
612
+ "p":0.7796610169,
613
+ "r":0.7603305785,
614
+ "f":0.769874477
615
+ }
616
+ },
617
+ "lemma_acc":0.9466699925,
618
+ "coref_lea_f1":0.4218334451,
619
+ "coref_lea_precision":0.4478869466,
620
+ "coref_lea_recall":0.398644375,
621
+ "nel_score":0.352,
622
+ "nel_score_desc":"micro F",
623
+ "nel_micro_p":0.8461538462,
624
+ "nel_micro_r":0.2222222222,
625
+ "nel_micro_f":0.352,
626
+ "nel_macro_p":0.8767857143,
627
+ "nel_macro_r":0.2475984839,
628
+ "nel_macro_f":0.3752026075,
629
+ "nel_f_per_type":{
630
+ "MISC":{
631
+ "p":1.0,
632
+ "r":0.2631578947,
633
+ "f":0.4166666667
634
+ },
635
+ "PER":{
636
+ "p":0.8571428571,
637
+ "r":0.1016949153,
638
+ "f":0.1818181818
639
  },
640
  "LOC":{
641
+ "p":1.0,
642
+ "r":0.4285714286,
643
+ "f":0.6
644
+ },
645
+ "ORG":{
646
+ "p":0.65,
647
+ "r":0.196969697,
648
+ "f":0.3023255814
649
  }
650
+ }
 
 
 
 
651
  },
652
  "sources":[
653
  {
654
+ "name":"UD Danish DDT v2.11",
655
  "url":"https://github.com/UniversalDependencies/UD_Danish-DDT",
656
  "license":"CC BY-SA 4.0",
657
  "author":"Johannsen, Anders; Mart\u00ednez Alonso, H\u00e9ctor; Plank, Barbara"
658
  },
659
  {
660
  "name":"DaNE",
661
+ "url":"https://huggingface.co/datasets/dane",
662
  "license":"CC BY-SA 4.0",
663
  "author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
664
  },
665
  {
666
+ "name":"DaCoref",
667
+ "url":"https://huggingface.co/datasets/alexandrainst/dacoref",
668
+ "license":"CC BY-SA 4.0",
669
+ "author":"Buch-Kromann, Matthias"
670
+ },
671
+ {
672
+ "name":"DaNED",
673
+ "url":"https://danlp-alexandra.readthedocs.io/en/stable/docs/datasets.html#daned",
674
+ "license":"CC BY-SA 4.0",
675
+ "author":"Barrett, M. J., Lam, H., Wu, M., Lacroix, O., Plank, B., & S\u00f8gaard, A."
676
+ },
677
+ {
678
+ "name":"jonfd/electra-small-nordic",
679
+ "author":"J\u00f3n Fri\u00f0rik Da\u00f0ason",
680
+ "url":"https://huggingface.co/jonfd/electra-small-nordic",
681
  "license":"CC BY 4.0"
682
  }
683
  ],
684
+ "notes":"\n\n### Training\nThis model was trained using [spaCy](https://spacy.io) and logged to [Weights & Biases](https://wandb.ai/kenevoldsen/dacy-v0.2.0). You can find all the training logs [here](https://wandb.ai/kenevoldsen/dacy-v0.2.0)."
 
 
 
685
  }
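
The `p`/`r`/`f` triples in the `performance` block above can be sanity-checked by hand. The following is a minimal sketch, not part of the commit: it assumes each reported `f` is the balanced F1 (the harmonic mean of `p` and `r`), and it copies three triples from the v0.2.0 side of the diff. The names `f_score`, `REPORTED`, and `check` are hypothetical helpers introduced only for illustration.

```python
import math


def f_score(p: float, r: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (0.0 if both are zero)."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


# (precision, recall, reported F) copied from the v0.2.0 "performance" block.
REPORTED = {
    "sents": (0.9295532646, 0.9575221239, 0.9433304272),
    "ents": (0.8306010929, 0.8172043011, 0.8238482385),
    "nel_micro": (0.8461538462, 0.2222222222, 0.352),
}


def check(reported: dict, tol: float = 1e-6) -> dict:
    """Map each metric name to True if its reported F matches the recomputed F1."""
    return {
        name: math.isclose(f_score(p, r), f, abs_tol=tol)
        for name, (p, r, f) in reported.items()
    }


if __name__ == "__main__":
    for name, ok in check(REPORTED).items():
        print(name, "consistent" if ok else "mismatch")
```

The same check makes the low `nel_micro_f` easy to read: precision is high, but recall of roughly 0.22 pulls the harmonic mean down to 0.352.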
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "AdpType=Prep|POS=ADP":"AdpType=Prep",
4
  "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":"Definite=Ind|Gender=Com|Number=Sing",
@@ -14,149 +15,150 @@
14
  "Degree=Pos|Number=Plur|POS=ADJ":"Degree=Pos|Number=Plur",
15
  "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":"Definite=Ind|Gender=Com|Number=Plur",
16
  "POS=PUNCT":"",
 
17
  "POS=CCONJ":"",
18
- "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Cmp|Number=Sing",
19
- "Degree=Cmp|POS=ADJ":"Degree=Cmp",
20
- "POS=PRON|PartType=Inf":"PartType=Inf",
21
- "Gender=Com|Number=Sing|POS=DET|PronType=Ind":"Gender=Com|Number=Sing|PronType=Ind",
22
- "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Number=Sing",
23
- "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Gender=Neut|Number=Sing|Person=3|PronType=Prs",
24
  "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":"Definite=Ind|Gender=Neut|Number=Plur",
25
- "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":"Definite=Def|Degree=Pos|Number=Sing",
26
- "Gender=Neut|Number=Sing|POS=DET|PronType=Dem":"Gender=Neut|Number=Sing|PronType=Dem",
 
27
  "Degree=Pos|POS=ADV":"Degree=Pos",
28
- "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Definite=Def|Number=Sing|Tense=Past|VerbForm=Part",
29
- "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":"Definite=Ind|Gender=Neut|Number=Sing",
30
- "POS=PRON|PronType=Dem":"PronType=Dem",
31
- "NumType=Card|POS=NUM":"NumType=Card",
32
- "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing",
33
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs",
34
- "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":"Degree=Pos|Gender=Com|Number=Sing",
35
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=3|PronType=Prs",
36
- "NumType=Ord|POS=ADJ":"NumType=Ord",
37
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
38
  "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Act",
39
- "POS=VERB|VerbForm=Inf|Voice=Act":"VerbForm=Inf|Voice=Act",
 
40
  "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Act",
41
- "POS=NOUN":"",
42
- "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass":"Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Pass",
43
  "POS=ADP|PartType=Inf":"PartType=Inf",
 
 
44
  "Degree=Pos|POS=ADJ":"Degree=Pos",
 
 
 
45
  "Definite=Def|Gender=Com|Number=Plur|POS=NOUN":"Definite=Def|Gender=Com|Number=Plur",
46
- "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs",
 
 
 
47
  "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN":"Case=Gen|Definite=Def|Gender=Com|Number=Sing",
 
 
48
  "POS=AUX|VerbForm=Inf|Voice=Act":"VerbForm=Inf|Voice=Act",
49
- "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Gender=Com|Number=Sing",
 
 
 
 
 
 
 
50
  "Gender=Com|Number=Sing|POS=DET|PronType=Dem":"Gender=Com|Number=Sing|PronType=Dem",
51
- "Number=Plur|POS=DET|PronType=Ind":"Number=Plur|PronType=Ind",
52
- "Gender=Com|Number=Sing|POS=PRON|PronType=Ind":"Gender=Com|Number=Sing|PronType=Ind",
53
- "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":"Case=Acc|Person=3|PronType=Prs|Reflex=Yes",
54
- "POS=PART|PartType=Inf":"PartType=Inf",
 
 
 
 
 
 
 
 
 
55
  "Gender=Neut|Number=Sing|POS=DET|PronType=Ind":"Gender=Neut|Number=Sing|PronType=Ind",
56
- "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Number=Plur|Person=3|PronType=Prs",
57
- "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN":"Case=Gen|Definite=Def|Gender=Neut|Number=Sing",
58
- "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs":"Case=Nom|Number=Plur|Person=3|PronType=Prs",
59
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=1|PronType=Prs",
60
- "Case=Nom|Gender=Com|POS=PRON|PronType=Ind":"Case=Nom|Gender=Com|PronType=Ind",
61
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind":"Gender=Neut|Number=Sing|PronType=Ind",
62
- "Mood=Imp|POS=VERB":"Mood=Imp",
63
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
64
- "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part":"Definite=Ind|Number=Sing|Tense=Past|VerbForm=Part",
65
- "POS=X":"",
66
  "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":"Case=Nom|Gender=Com|Number=Plur|Person=1|PronType=Prs",
67
  "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN":"Case=Gen|Definite=Def|Gender=Com|Number=Plur",
68
- "POS=VERB|Tense=Pres|VerbForm=Part":"Tense=Pres|VerbForm=Part",
69
- "Number=Plur|POS=PRON|PronType=Int,Rel":"Number=Plur|PronType=Int,Rel",
70
- "POS=VERB|VerbForm=Inf|Voice=Pass":"VerbForm=Inf|Voice=Pass",
71
- "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Com|Number=Sing",
72
- "Degree=Cmp|POS=ADV":"Degree=Cmp",
73
- "POS=ADV|PartType=Inf":"PartType=Inf",
74
- "Degree=Sup|POS=ADV":"Degree=Sup",
75
  "Number=Plur|POS=PRON|PronType=Dem":"Number=Plur|PronType=Dem",
76
- "Number=Plur|POS=PRON|PronType=Ind":"Number=Plur|PronType=Ind",
77
- "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":"Definite=Def|Gender=Neut|Number=Plur",
78
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=1|PronType=Prs",
79
- "Case=Gen|POS=PROPN":"Case=Gen",
80
- "POS=ADP":"",
81
  "Degree=Cmp|Number=Plur|POS=ADJ":"Degree=Cmp|Number=Plur",
82
- "Definite=Def|Degree=Sup|POS=ADJ":"Definite=Def|Degree=Sup",
83
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
84
- "Degree=Pos|Number=Sing|POS=ADJ":"Degree=Pos|Number=Sing",
85
- "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Number=Plur|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
86
  "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Gender=Com|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
87
  "Number=Plur|POS=PRON|PronType=Rcp":"Number=Plur|PronType=Rcp",
88
  "Case=Gen|Degree=Cmp|POS=ADJ":"Case=Gen|Degree=Cmp",
89
  "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":"Case=Gen|Definite=Def|Gender=Neut|Number=Plur",
90
- "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=3|Poss=Yes|PronType=Prs",
91
- "POS=INTJ":"",
92
- "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Number=Plur|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
93
- "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":"Degree=Pos|Gender=Neut|Number=Sing",
94
- "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Gender=Neut|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
95
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=2|PronType=Prs",
96
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
97
- "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Neut|Number=Plur",
98
- "Number=Sing|POS=PRON|PronType=Int,Rel":"Number=Sing|PronType=Int,Rel",
99
  "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Number=Plur|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
100
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel":"Gender=Neut|Number=Sing|PronType=Int,Rel",
101
- "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ":"Definite=Def|Degree=Sup|Number=Plur",
102
- "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=2|PronType=Prs",
103
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
104
- "Definite=Ind|Number=Sing|POS=NOUN":"Definite=Ind|Number=Sing",
105
- "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
106
  "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Number=Plur|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
107
- "POS=SYM":"",
108
- "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":"Case=Nom|Gender=Com|Person=2|Polite=Form|PronType=Prs",
109
- "Degree=Sup|POS=ADJ":"Degree=Sup",
110
- "Number=Plur|POS=DET|PronType=Ind|Style=Arch":"Number=Plur|PronType=Ind|Style=Arch",
111
- "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Dem":"Case=Gen|Gender=Com|Number=Sing|PronType=Dem",
112
- "Foreign=Yes|POS=X":"Foreign=Yes",
113
  "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs":"Person=2|Polite=Form|Poss=Yes|PronType=Prs",
114
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem":"Gender=Neut|Number=Sing|PronType=Dem",
115
- "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":"Case=Acc|Gender=Com|Number=Plur|Person=1|PronType=Prs",
116
- "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Neut|Number=Sing",
117
- "Case=Gen|POS=PRON|PronType=Int,Rel":"Case=Gen|PronType=Int,Rel",
118
- "Gender=Com|Number=Sing|POS=PRON|PronType=Dem":"Gender=Com|Number=Sing|PronType=Dem",
119
- "Abbr=Yes|POS=X":"Abbr=Yes",
120
- "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Com|Number=Plur",
121
  "Definite=Def|Degree=Abs|POS=ADJ":"Definite=Def|Degree=Abs",
122
- "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Sup|Number=Sing",
123
- "Definite=Ind|POS=NOUN":"Definite=Ind",
124
- "Gender=Com|Number=Plur|POS=NOUN":"Gender=Com|Number=Plur",
125
- "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs",
126
- "Gender=Com|POS=PRON|PronType=Int,Rel":"Gender=Com|PronType=Int,Rel",
127
- "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Gender=Com|Number=Plur|Person=2|PronType=Prs",
128
  "Degree=Abs|POS=ADV":"Degree=Abs",
129
- "POS=VERB|VerbForm=Ger":"VerbForm=Ger",
130
- "POS=VERB|Tense=Past|VerbForm=Part":"Tense=Past|VerbForm=Part",
131
- "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ":"Definite=Def|Degree=Sup|Number=Sing",
132
- "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Number=Plur|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
133
  "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":"Case=Gen|Definite=Def|Degree=Pos|Number=Sing",
134
- "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ":"Case=Gen|Degree=Pos|Number=Plur",
135
- "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":"Case=Acc|Gender=Com|Person=2|Polite=Form|PronType=Prs",
136
  "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel":"Gender=Com|Number=Sing|PronType=Int,Rel",
137
- "POS=VERB|Tense=Pres":"Tense=Pres",
138
- "Case=Gen|Number=Plur|POS=DET|PronType=Ind":"Case=Gen|Number=Plur|PronType=Ind",
139
- "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=2|Poss=Yes|PronType=Prs",
140
- "POS=PRON|Person=2|Polite=Form|Poss=Yes|PronType=Prs":"Person=2|Polite=Form|Poss=Yes|PronType=Prs",
141
  "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
142
- "POS=AUX|Tense=Pres|VerbForm=Part":"Tense=Pres|VerbForm=Part",
143
- "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Pass",
144
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
145
- "Degree=Sup|Number=Plur|POS=ADJ":"Degree=Sup|Number=Plur",
146
- "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Gender=Com|Number=Plur|Person=2|PronType=Prs",
147
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
148
- "Definite=Ind|Number=Plur|POS=NOUN":"Definite=Ind|Number=Plur",
149
- "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Case=Gen|Number=Plur|Tense=Past|VerbForm=Part",
150
- "Mood=Imp|POS=AUX":"Mood=Imp",
151
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
152
- "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs",
153
- "Definite=Def|Gender=Com|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Definite=Def|Gender=Com|Number=Sing|Tense=Past|VerbForm=Part",
154
  "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Number=Plur|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
155
  "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind":"Case=Gen|Gender=Com|Number=Sing|PronType=Ind",
156
  "Case=Gen|POS=NOUN":"Case=Gen",
157
- "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=3|Poss=Yes|PronType=Prs",
158
- "POS=DET|PronType=Dem":"PronType=Dem",
159
- "Definite=Def|Number=Plur|POS=NOUN":"Definite=Def|Number=Plur"
160
  },
161
  "labels_pos":{
162
  "AdpType=Prep|POS=ADP":85,
@@ -173,148 +175,150 @@
173
  "Degree=Pos|Number=Plur|POS=ADJ":84,
174
  "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":92,
175
  "POS=PUNCT":97,
176
  "POS=CCONJ":89,
177
- "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ":84,
178
- "Degree=Cmp|POS=ADJ":84,
179
- "POS=PRON|PartType=Inf":95,
180
- "Gender=Com|Number=Sing|POS=DET|PronType=Ind":90,
181
- "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ":84,
182
- "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
183
  "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":92,
184
- "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":84,
185
- "Gender=Neut|Number=Sing|POS=DET|PronType=Dem":90,
186
  "Degree=Pos|POS=ADV":86,
187
- "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
188
- "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":92,
189
- "POS=PRON|PronType=Dem":95,
190
- "NumType=Card|POS=NUM":93,
191
- "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":84,
192
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
193
- "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":84,
194
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
195
- "NumType=Ord|POS=ADJ":84,
196
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
197
  "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act":87,
198
- "POS=VERB|VerbForm=Inf|Voice=Act":100,
199
  "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act":100,
200
- "POS=NOUN":92,
201
- "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass":100,
202
  "POS=ADP|PartType=Inf":85,
203
  "Degree=Pos|POS=ADJ":84,
204
  "Definite=Def|Gender=Com|Number=Plur|POS=NOUN":92,
205
- "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs":90,
206
  "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN":92,
207
  "POS=AUX|VerbForm=Inf|Voice=Act":87,
208
- "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":84,
209
  "Gender=Com|Number=Sing|POS=DET|PronType=Dem":90,
210
- "Number=Plur|POS=DET|PronType=Ind":90,
211
- "Gender=Com|Number=Sing|POS=PRON|PronType=Ind":95,
212
- "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":95,
213
- "POS=PART|PartType=Inf":94,
214
  "Gender=Neut|Number=Sing|POS=DET|PronType=Ind":90,
215
- "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
216
- "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN":92,
217
- "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
218
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":95,
219
- "Case=Nom|Gender=Com|POS=PRON|PronType=Ind":95,
220
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind":95,
221
- "Mood=Imp|POS=VERB":100,
222
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
223
- "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part":87,
224
- "POS=X":101,
225
  "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":95,
226
  "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN":92,
227
- "POS=VERB|Tense=Pres|VerbForm=Part":100,
228
- "Number=Plur|POS=PRON|PronType=Int,Rel":95,
229
- "POS=VERB|VerbForm=Inf|Voice=Pass":100,
230
- "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":92,
231
- "Degree=Cmp|POS=ADV":86,
232
- "POS=ADV|PartType=Inf":86,
233
- "Degree=Sup|POS=ADV":86,
234
  "Number=Plur|POS=PRON|PronType=Dem":95,
235
- "Number=Plur|POS=PRON|PronType=Ind":95,
236
- "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":92,
237
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":95,
238
- "Case=Gen|POS=PROPN":96,
239
- "POS=ADP":85,
240
  "Degree=Cmp|Number=Plur|POS=ADJ":84,
241
- "Definite=Def|Degree=Sup|POS=ADJ":84,
242
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
243
- "Degree=Pos|Number=Sing|POS=ADJ":84,
244
- "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
245
  "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
246
  "Number=Plur|POS=PRON|PronType=Rcp":95,
247
  "Case=Gen|Degree=Cmp|POS=ADJ":84,
248
  "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":92,
249
- "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs":90,
250
- "POS=INTJ":91,
251
- "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
252
- "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":84,
253
- "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
254
- "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":95,
255
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
256
- "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":92,
257
- "Number=Sing|POS=PRON|PronType=Int,Rel":95,
258
  "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
259
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel":95,
260
- "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ":84,
261
- "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":95,
262
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
263
- "Definite=Ind|Number=Sing|POS=NOUN":92,
264
- "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
265
  "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":95,
266
- "POS=SYM":99,
267
- "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":95,
268
- "Degree=Sup|POS=ADJ":84,
269
- "Number=Plur|POS=DET|PronType=Ind|Style=Arch":90,
270
- "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Dem":90,
271
- "Foreign=Yes|POS=X":101,
272
  "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs":90,
273
- "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem":95,
274
- "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":95,
275
- "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":92,
276
- "Case=Gen|POS=PRON|PronType=Int,Rel":95,
277
- "Gender=Com|Number=Sing|POS=PRON|PronType=Dem":95,
278
- "Abbr=Yes|POS=X":101,
279
- "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":92,
280
  "Definite=Def|Degree=Abs|POS=ADJ":84,
281
- "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ":84,
282
- "Definite=Ind|POS=NOUN":92,
283
- "Gender=Com|Number=Plur|POS=NOUN":92,
284
- "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
285
- "Gender=Com|POS=PRON|PronType=Int,Rel":95,
286
- "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":95,
287
  "Degree=Abs|POS=ADV":86,
288
- "POS=VERB|VerbForm=Ger":100,
289
- "POS=VERB|Tense=Past|VerbForm=Part":100,
290
- "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ":84,
291
- "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form":95,
292
  "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":84,
293
- "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ":84,
294
- "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":95,
295
  "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel":95,
296
- "POS=VERB|Tense=Pres":100,
297
- "Case=Gen|Number=Plur|POS=DET|PronType=Ind":90,
298
- "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
299
- "POS=PRON|Person=2|Polite=Form|Poss=Yes|PronType=Prs":95,
300
  "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
301
- "POS=AUX|Tense=Pres|VerbForm=Part":87,
302
- "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass":100,
303
- "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":95,
304
- "Degree=Sup|Number=Plur|POS=ADJ":84,
305
- "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":95,
306
- "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":95,
307
- "Definite=Ind|Number=Plur|POS=NOUN":92,
308
- "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
309
- "Mood=Imp|POS=AUX":87,
310
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs":95,
311
- "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
312
- "Definite=Def|Gender=Com|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
313
  "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
314
  "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind":90,
315
  "Case=Gen|POS=NOUN":92,
316
- "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
317
- "POS=DET|PronType=Dem":90,
318
- "Definite=Def|Number=Plur|POS=NOUN":92
319
- }
320
  }
 
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "AdpType=Prep|POS=ADP":"AdpType=Prep",
5
  "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":"Definite=Ind|Gender=Com|Number=Sing",
 
15
  "Degree=Pos|Number=Plur|POS=ADJ":"Degree=Pos|Number=Plur",
16
  "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":"Definite=Ind|Gender=Com|Number=Plur",
17
  "POS=PUNCT":"",
18
+ "NumType=Ord|POS=ADJ":"NumType=Ord",
19
  "POS=CCONJ":"",
20
  "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":"Definite=Ind|Gender=Neut|Number=Plur",
21
+ "POS=VERB|VerbForm=Inf|Voice=Act":"VerbForm=Inf|Voice=Act",
22
+ "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Gender=Neut|Number=Sing|Person=3|PronType=Prs",
23
+ "Degree=Sup|POS=ADV":"Degree=Sup",
24
  "Degree=Pos|POS=ADV":"Degree=Pos",
25
+ "Gender=Com|Number=Sing|POS=DET|PronType=Ind":"Gender=Com|Number=Sing|PronType=Ind",
26
+ "Number=Plur|POS=DET|PronType=Ind":"Number=Plur|PronType=Ind",
27
+ "POS=ADP":"",
28
+ "POS=ADV|PartType=Inf":"PartType=Inf",
29
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=3|PronType=Prs",
30
  "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Act",
31
+ "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":"Definite=Def|Degree=Pos|Number=Sing",
32
+ "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs",
33
  "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Act",
34
  "POS=ADP|PartType=Inf":"PartType=Inf",
35
+ "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Gender=Com|Number=Sing",
36
+ "NumType=Card|POS=NUM":"NumType=Card",
37
  "Degree=Pos|POS=ADJ":"Degree=Pos",
38
+ "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part":"Definite=Ind|Number=Sing|Tense=Past|VerbForm=Part",
39
+ "POS=PART|PartType=Inf":"PartType=Inf",
40
+ "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":"Case=Acc|Person=3|PronType=Prs|Reflex=Yes",
41
  "Definite=Def|Gender=Com|Number=Plur|POS=NOUN":"Definite=Def|Gender=Com|Number=Plur",
42
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":"Definite=Ind|Gender=Neut|Number=Sing",
43
+ "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=3|Poss=Yes|PronType=Prs",
44
+ "POS=VERB|Tense=Pres|VerbForm=Part":"Tense=Pres|VerbForm=Part",
45
+ "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs":"Case=Nom|Number=Plur|Person=3|PronType=Prs",
46
  "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN":"Case=Gen|Definite=Def|Gender=Com|Number=Sing",
47
+ "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ":"Definite=Def|Degree=Sup|Number=Plur",
48
+ "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Number=Plur|Person=3|PronType=Prs",
49
  "POS=AUX|VerbForm=Inf|Voice=Act":"VerbForm=Inf|Voice=Act",
50
+ "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing",
51
+ "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Cmp|Number=Sing",
52
+ "Degree=Cmp|POS=ADJ":"Degree=Cmp",
53
+ "POS=PRON|PartType=Inf":"PartType=Inf",
54
+ "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Pos|Number=Sing",
55
+ "Case=Nom|Gender=Com|POS=PRON|PronType=Ind":"Case=Nom|Gender=Com|PronType=Ind",
56
+ "Number=Plur|POS=PRON|PronType=Ind":"Number=Plur|PronType=Ind",
57
+ "POS=INTJ":"",
58
  "Gender=Com|Number=Sing|POS=DET|PronType=Dem":"Gender=Com|Number=Sing|PronType=Dem",
59
+ "Case=Gen|Number=Plur|POS=DET|PronType=Ind":"Case=Gen|Number=Plur|PronType=Ind",
60
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass":"Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Pass",
61
+ "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":"Definite=Def|Gender=Neut|Number=Plur",
62
+ "Degree=Cmp|POS=ADV":"Degree=Cmp",
63
+ "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Number=Plur|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
64
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs",
65
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Number=Plur|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
66
+ "Case=Gen|POS=PROPN":"Case=Gen",
67
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind":"Gender=Neut|Number=Sing|PronType=Ind",
68
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
69
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
70
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=1|PronType=Prs",
71
+ "Definite=Def|Degree=Sup|POS=ADJ":"Definite=Def|Degree=Sup",
72
  "Gender=Neut|Number=Sing|POS=DET|PronType=Ind":"Gender=Neut|Number=Sing|PronType=Ind",
73
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Neut|Number=Sing",
74
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Dem":"Gender=Neut|Number=Sing|PronType=Dem",
75
+ "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Definite=Def|Number=Sing|Tense=Past|VerbForm=Part",
76
+ "POS=PRON|PronType=Dem":"PronType=Dem",
77
+ "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":"Degree=Pos|Gender=Com|Number=Sing",
78
+ "Number=Plur|POS=NUM":"Number=Plur",
79
+ "POS=VERB|VerbForm=Inf|Voice=Pass":"VerbForm=Inf|Voice=Pass",
80
+ "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ":"Definite=Def|Degree=Sup|Number=Sing",
81
+ "Number=Sing|POS=PRON|PronType=Int,Rel":"Number=Sing|PronType=Int,Rel",
82
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=1|PronType=Prs",
83
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
84
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
85
+ "POS=PRON":"",
86
+ "Definite=Ind|Number=Sing|POS=NOUN":"Definite=Ind|Number=Sing",
87
+ "Definite=Ind|Number=Sing|POS=NUM":"Definite=Ind|Number=Sing",
88
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Com|Number=Sing",
89
+ "Foreign=Yes|POS=ADV":"Foreign=Yes",
90
+ "POS=NOUN":"",
91
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN":"Case=Gen|Definite=Def|Gender=Neut|Number=Sing",
92
+ "Gender=Com|Number=Plur|POS=NOUN":"Gender=Com|Number=Plur",
93
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel":"Gender=Neut|Number=Sing|PronType=Int,Rel",
94
  "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":"Case=Nom|Gender=Com|Number=Plur|Person=1|PronType=Prs",
95
+ "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs",
96
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Ind":"Gender=Com|Number=Sing|PronType=Ind",
97
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Com|Number=Plur",
98
+ "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":"Degree=Pos|Gender=Neut|Number=Sing",
99
+ "Degree=Sup|POS=ADJ":"Degree=Sup",
100
+ "Degree=Pos|Number=Sing|POS=ADJ":"Degree=Pos|Number=Sing",
101
+ "Mood=Imp|POS=VERB":"Mood=Imp",
102
+ "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":"Case=Nom|Gender=Com|Person=2|Polite=Form|PronType=Prs",
103
+ "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":"Case=Acc|Gender=Com|Person=2|Polite=Form|PronType=Prs",
104
+ "POS=X":"",
105
  "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN":"Case=Gen|Definite=Def|Gender=Com|Number=Plur",
106
  "Number=Plur|POS=PRON|PronType=Dem":"Number=Plur|PronType=Dem",
107
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":"Case=Acc|Gender=Com|Number=Plur|Person=1|PronType=Prs",
108
+ "Number=Plur|POS=PRON|PronType=Int,Rel":"Number=Plur|PronType=Int,Rel",
109
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
110
  "Degree=Cmp|Number=Plur|POS=ADJ":"Degree=Cmp|Number=Plur",
111
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":"Number=Plur|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
112
  "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Gender=Com|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
113
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Gender=Com|Number=Sing|Person=2|PronType=Prs",
114
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Gender=Com|Number=Sing|Person=2|PronType=Prs",
115
+ "Gender=Com|POS=PRON|PronType=Int,Rel":"Gender=Com|PronType=Int,Rel",
116
+ "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ":"Case=Gen|Degree=Pos|Number=Plur",
117
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
118
+ "POS=VERB|VerbForm=Ger":"VerbForm=Ger",
119
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Dem":"Gender=Com|Number=Sing|PronType=Dem",
120
+ "Case=Gen|POS=PRON|PronType=Int,Rel":"Case=Gen|PronType=Int,Rel",
121
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass":"Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Pass",
122
+ "Abbr=Yes|POS=X":"Abbr=Yes",
123
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":"Case=Gen|Definite=Ind|Gender=Neut|Number=Plur",
124
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
125
+ "Definite=Ind|Number=Plur|POS=NOUN":"Definite=Ind|Number=Plur",
126
+ "Foreign=Yes|POS=X":"Foreign=Yes",
127
  "Number=Plur|POS=PRON|PronType=Rcp":"Number=Plur|PronType=Rcp",
128
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Gender=Com|Number=Plur|Person=2|PronType=Prs",
129
  "Case=Gen|Degree=Cmp|POS=ADJ":"Case=Gen|Degree=Cmp",
130
  "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":"Case=Gen|Definite=Def|Gender=Neut|Number=Plur",
131
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Gender=Com|Number=Plur|Person=2|PronType=Prs",
132
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem":"Gender=Neut|Number=Sing|PronType=Dem",
133
  "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Number=Plur|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
134
+ "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":"Gender=Neut|Number=Sing|Number[psor]=Plur|Person=1|Poss=Yes|PronType=Prs|Style=Form",
135
  "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":"Number=Plur|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes",
136
+ "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs",
137
+ "Case=Gen|Number=Plur|POS=PRON|PronType=Rcp":"Case=Gen|Number=Plur|PronType=Rcp",
138
  "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs":"Person=2|Polite=Form|Poss=Yes|PronType=Prs",
139
+ "POS=SYM":"",
140
+ "POS=DET|PronType=Dem":"PronType=Dem",
141
+ "Gender=Com|Number=Sing|POS=NUM":"Gender=Com|Number=Sing",
142
+ "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=2|Poss=Yes|PronType=Prs",
143
+ "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Case=Gen|Number=Plur|Tense=Past|VerbForm=Part",
144
  "Definite=Def|Degree=Abs|POS=ADJ":"Definite=Def|Degree=Abs",
145
+ "POS=VERB|Tense=Pres":"Tense=Pres",
146
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NUM":"Definite=Ind|Gender=Neut|Number=Sing",
147
  "Degree=Abs|POS=ADV":"Degree=Abs",
148
  "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":"Case=Gen|Definite=Def|Degree=Pos|Number=Sing",
149
  "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel":"Gender=Com|Number=Sing|PronType=Int,Rel",
150
+ "POS=VERB|Tense=Past|VerbForm=Part":"Tense=Past|VerbForm=Part",
151
+ "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ":"Definite=Ind|Degree=Sup|Number=Sing",
152
  "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Gender=Neut|Number=Sing|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
153
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs":"Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs",
154
  "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":"Number=Plur|Number[psor]=Sing|Person=2|Poss=Yes|PronType=Prs",
155
+ "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Number[psor]=Plur|Person=3|Poss=Yes|PronType=Prs",
156
+ "Definite=Ind|POS=NOUN":"Definite=Ind",
157
  "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind":"Case=Gen|Gender=Com|Number=Sing|PronType=Ind",
158
+ "Definite=Ind|Gender=Com|Number=Sing|POS=NUM":"Definite=Ind|Gender=Com|Number=Sing",
159
+ "Definite=Def|Number=Plur|POS=NOUN":"Definite=Def|Number=Plur",
160
  "Case=Gen|POS=NOUN":"Case=Gen",
161
+ "POS=AUX|Tense=Pres|VerbForm=Part":"Tense=Pres|VerbForm=Part"
 
 
162
  },
163
  "labels_pos":{
164
  "AdpType=Prep|POS=ADP":85,
 
175
  "Degree=Pos|Number=Plur|POS=ADJ":84,
176
  "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":92,
177
  "POS=PUNCT":97,
178
+ "NumType=Ord|POS=ADJ":84,
179
  "POS=CCONJ":89,
 
 
 
 
 
 
180
  "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":92,
181
+ "POS=VERB|VerbForm=Inf|Voice=Act":100,
182
+ "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
183
+ "Degree=Sup|POS=ADV":86,
184
  "Degree=Pos|POS=ADV":86,
185
+ "Gender=Com|Number=Sing|POS=DET|PronType=Ind":90,
186
+ "Number=Plur|POS=DET|PronType=Ind":90,
187
+ "POS=ADP":85,
188
+ "POS=ADV|PartType=Inf":86,
 
 
 
189
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
 
 
190
  "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act":87,
191
+ "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":84,
192
+ "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs":90,
193
  "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act":100,
 
 
194
  "POS=ADP|PartType=Inf":85,
195
+ "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":84,
196
+ "NumType=Card|POS=NUM":93,
197
  "Degree=Pos|POS=ADJ":84,
198
+ "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part":87,
199
+ "POS=PART|PartType=Inf":94,
200
+ "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":95,
201
  "Definite=Def|Gender=Com|Number=Plur|POS=NOUN":92,
202
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":92,
203
+ "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs":90,
204
+ "POS=VERB|Tense=Pres|VerbForm=Part":100,
205
+ "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
206
  "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN":92,
207
+ "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ":84,
208
+ "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
209
  "POS=AUX|VerbForm=Inf|Voice=Act":87,
210
+ "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":84,
211
+ "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ":84,
212
+ "Degree=Cmp|POS=ADJ":84,
213
+ "POS=PRON|PartType=Inf":95,
214
+ "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ":84,
215
+ "Case=Nom|Gender=Com|POS=PRON|PronType=Ind":95,
216
+ "Number=Plur|POS=PRON|PronType=Ind":95,
217
+ "POS=INTJ":91,
218
  "Gender=Com|Number=Sing|POS=DET|PronType=Dem":90,
219
+ "Case=Gen|Number=Plur|POS=DET|PronType=Ind":90,
220
+ "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass":100,
221
+ "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":92,
222
+ "Degree=Cmp|POS=ADV":86,
223
+ "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form":95,
224
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
225
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
226
+ "Case=Gen|POS=PROPN":96,
227
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind":95,
228
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
229
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
230
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":95,
231
+ "Definite=Def|Degree=Sup|POS=ADJ":84,
232
  "Gender=Neut|Number=Sing|POS=DET|PronType=Ind":90,
233
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN":92,
234
+ "Gender=Neut|Number=Sing|POS=DET|PronType=Dem":90,
235
+ "Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
236
+ "POS=PRON|PronType=Dem":95,
237
+ "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ":84,
238
+ "Number=Plur|POS=NUM":93,
239
+ "POS=VERB|VerbForm=Inf|Voice=Pass":100,
240
+ "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ":84,
241
+ "Number=Sing|POS=PRON|PronType=Int,Rel":95,
242
  "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs":95,
243
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
 
 
244
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
245
+ "POS=PRON":95,
246
+ "Definite=Ind|Number=Sing|POS=NOUN":92,
247
+ "Definite=Ind|Number=Sing|POS=NUM":93,
248
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":92,
249
+ "Foreign=Yes|POS=ADV":86,
250
+ "POS=NOUN":92,
251
+ "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN":92,
252
+ "Gender=Com|Number=Plur|POS=NOUN":92,
253
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel":95,
254
  "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":95,
255
+ "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
256
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Ind":95,
257
+ "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN":92,
258
+ "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ":84,
259
+ "Degree=Sup|POS=ADJ":84,
260
+ "Degree=Pos|Number=Sing|POS=ADJ":84,
261
+ "Mood=Imp|POS=VERB":100,
262
+ "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":95,
263
+ "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs":95,
264
+ "POS=X":101,
265
  "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN":92,
 
 
 
 
 
 
 
266
  "Number=Plur|POS=PRON|PronType=Dem":95,
267
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs":95,
268
+ "Number=Plur|POS=PRON|PronType=Int,Rel":95,
269
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":90,
 
 
270
  "Degree=Cmp|Number=Plur|POS=ADJ":84,
271
+ "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs":90,
 
 
 
272
  "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
273
+ "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":95,
274
+ "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs":95,
275
+ "Gender=Com|POS=PRON|PronType=Int,Rel":95,
276
+ "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ":84,
277
+ "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":95,
278
+ "POS=VERB|VerbForm=Ger":100,
279
+ "Gender=Com|Number=Sing|POS=PRON|PronType=Dem":95,
280
+ "Case=Gen|POS=PRON|PronType=Int,Rel":95,
281
+ "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass":100,
282
+ "Abbr=Yes|POS=X":101,
283
+ "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN":92,
284
+ "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
285
+ "Definite=Ind|Number=Plur|POS=NOUN":92,
286
+ "Foreign=Yes|POS=X":101,
287
  "Number=Plur|POS=PRON|PronType=Rcp":95,
288
+ "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":95,
289
  "Case=Gen|Degree=Cmp|POS=ADJ":84,
290
  "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN":92,
291
+ "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs":95,
292
+ "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem":95,
 
 
 
 
 
 
 
293
  "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
294
+ "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form":90,
 
 
 
 
 
295
  "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes":95,
296
+ "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
297
+ "Case=Gen|Number=Plur|POS=PRON|PronType=Rcp":95,
 
 
 
 
298
  "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs":90,
299
+ "POS=SYM":99,
300
+ "POS=DET|PronType=Dem":90,
301
+ "Gender=Com|Number=Sing|POS=NUM":93,
302
+ "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
303
+ "Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
 
 
304
  "Definite=Def|Degree=Abs|POS=ADJ":84,
305
+ "POS=VERB|Tense=Pres":100,
306
+ "Definite=Ind|Gender=Neut|Number=Sing|POS=NUM":93,
 
 
 
 
307
  "Degree=Abs|POS=ADV":86,
 
 
 
 
308
  "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ":84,
 
 
309
  "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel":95,
310
+ "POS=VERB|Tense=Past|VerbForm=Part":100,
311
+ "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ":84,
 
 
312
  "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
 
 
 
 
 
 
 
 
 
313
  "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs":95,
 
 
314
  "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs":90,
315
+ "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
316
+ "Definite=Ind|POS=NOUN":92,
317
  "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind":90,
318
+ "Definite=Ind|Gender=Com|Number=Sing|POS=NUM":93,
319
+ "Definite=Def|Number=Plur|POS=NOUN":92,
320
  "Case=Gen|POS=NOUN":92,
321
+ "POS=AUX|Tense=Pres|VerbForm=Part":87
322
+ },
323
+ "overwrite":true
 
324
  }
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:601cec06d7bb6f1e2025cf6878f5c8fb02d89b5fc71ba82c80e718a28c63f87f
- size 161992
+ oid sha256:62f62ceb4605eaee543436b7f396d4ba496714d45057163334ed243f798562f5
+ size 163072
ner/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6c7bd95a31a59f7cb632de4a99c12643602828d312d04a7ba233f3bdb7f15778
+ oid sha256:aed7ed7c8ab350a943c2b814c3eec1a517a9a3413a6c32a4fb16f4ecddd7b933
  size 94890
ner/moves CHANGED
@@ -1 +1 @@
1
- ��moves��{"0":{},"1":{"PER":2146,"MISC":1273,"ORG":1267,"LOC":1144},"2":{"PER":2146,"MISC":1273,"ORG":1267,"LOC":1144},"3":{"PER":2146,"MISC":1273,"ORG":1267,"LOC":1144},"4":{"PER":2146,"MISC":1273,"ORG":1267,"LOC":1144,"":1},"5":{"":1}}�cfg��neg_key�
 
1
+ ��moves��{"0":{},"1":{"PER":1361,"ORG":943,"MISC":826,"LOC":768},"2":{"PER":1361,"ORG":943,"MISC":826,"LOC":768},"3":{"PER":1361,"ORG":943,"MISC":826,"LOC":768},"4":{"PER":1361,"ORG":943,"MISC":826,"LOC":768,"":1},"5":{"":1}}�cfg��neg_key�
parser/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:db9711e97c156d5c9892a65b87d6a185289f74b92dcec527cf6906dfb6e821a6
- size 325085
+ oid sha256:fac2b6c9c6d415229256f7bc3e163329a906f0af57e961f33c988f9dc606046c
+ size 826643
parser/moves CHANGED
@@ -1 +1 @@
1
- ��moves�2{"0":{"":41514},"1":{"":34292},"2":{"case":7489,"nsubj":6009,"det":4334,"amod":3968,"advmod":3657,"mark":3529,"aux":2432,"cc":2261,"punct":2182,"cop":1329,"obl":894,"nummod":799,"nmod:poss":651,"nmod":460,"expl":291,"ccomp":202,"obj":195,"xcomp":122,"case||nmod":73,"obl:tmod":53,"dep":49,"acl:relcl":43},"3":{"punct":8600,"obl":3949,"obj":3758,"nmod":3565,"conj":2743,"advmod":2095,"flat":1294,"nsubj":1172,"acl:relcl":1131,"advcl":808,"amod":629,"obl:loc":467,"fixed":390,"dep":322,"xcomp":272,"appos":268,"compound:prt":261,"ccomp":252,"acl:relcl||nsubj":237,"case":202,"nummod":167,"list":161,"nmod:poss":156,"punct||conj":151,"mark":137,"cc":135,"iobj":107,"expl":77,"cop":69,"nmod||case":60,"aux":48,"obl:tmod":45,"cc||case":43,"advcl||advmod":43,"cc||conj":40,"case||obl":38,"punct||case":33},"4":{"ROOT":4367}}�cfg��neg_key�
 
1
+ ��moves��{"0":{"":30710},"1":{"":22084},"2":{"case":5238,"nsubj":4163,"punct":3257,"det":3028,"amod":2815,"advmod":2482,"mark":2317,"aux":1748,"cc":1610,"cop":823,"obl":627,"nummod":620,"nmod:poss":457,"nmod":384,"expl":193,"obj":188,"ccomp":155,"advcl":110,"xcomp":81,"case||nmod":45,"dep":32,"obl:tmod":31},"3":{"punct":4355,"obl":2759,"obj":2659,"nmod":2503,"conj":1923,"advmod":1246,"flat":886,"nsubj":805,"acl:relcl":800,"advcl":744,"amod":415,"xcomp":307,"advmod:lmod":273,"fixed":267,"dep":218,"compound:prt":211,"appos":187,"ccomp":177,"acl:relcl||nsubj":144,"case":130,"nmod:poss":112,"mark":103,"iobj":99,"nummod":93,"list":86,"cc":72,"expl":55,"cop":40,"obl:lmod":35,"obl:tmod":34,"cc||case":31},"4":{"ROOT":2970}}�cfg��neg_key�
span_resolver/cfg ADDED
@@ -0,0 +1,3 @@
+ {
+ "nI":256
+ }
span_resolver/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9d1a257656d5f4da2c86984b7f404c98f68c677805bdbba025fef0d4784cf7f0
+ size 3518661
tagger/cfg ADDED
@@ -0,0 +1,23 @@
+ {
+ "labels":[
+ "ADJ",
+ "ADP",
+ "ADV",
+ "AUX",
+ "CCONJ",
+ "DET",
+ "INTJ",
+ "NOUN",
+ "NUM",
+ "PART",
+ "PRON",
+ "PROPN",
+ "PUNCT",
+ "SCONJ",
+ "SYM",
+ "VERB",
+ "X"
+ ],
+ "neg_prefix":"!",
+ "overwrite":false
+ }
tagger/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5b16b720b9df4f6a0745b5add2f9a04e471af75629816daac1afaaac9034c048
+ size 18116
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
 
trainable_lemmatizer/cfg ADDED
@@ -0,0 +1,348 @@
1
+ {
2
+ "labels":[
3
+ 1,
4
+ 2,
5
+ 4,
6
+ 6,
7
+ 8,
8
+ 10,
9
+ 12,
10
+ 14,
11
+ 16,
12
+ 18,
13
+ 20,
14
+ 24,
15
+ 26,
16
+ 29,
17
+ 30,
18
+ 34,
19
+ 36,
20
+ 38,
21
+ 42,
22
+ 44,
23
+ 46,
24
+ 48,
25
+ 50,
26
+ 52,
27
+ 54,
28
+ 56,
29
+ 57,
30
+ 60,
31
+ 63,
32
+ 65,
33
+ 67,
34
+ 69,
35
+ 71,
36
+ 73,
37
+ 75,
38
+ 76,
39
+ 78,
40
+ 81,
41
+ 83,
42
+ 84,
43
+ 86,
44
+ 88,
45
+ 92,
46
+ 96,
47
+ 98,
48
+ 100,
49
+ 103,
50
+ 106,
51
+ 108,
52
+ 110,
53
+ 113,
54
+ 115,
55
+ 117,
56
+ 119,
57
+ 121,
58
+ 124,
59
+ 125,
60
+ 127,
61
+ 129,
62
+ 131,
63
+ 133,
64
+ 134,
65
+ 138,
66
+ 140,
67
+ 142,
68
+ 144,
69
+ 146,
70
+ 148,
71
+ 151,
72
+ 153,
73
+ 155,
74
+ 156,
75
+ 159,
76
+ 160,
77
+ 162,
78
+ 164,
79
+ 166,
80
+ 167,
81
+ 168,
82
+ 170,
83
+ 172,
84
+ 175,
85
+ 177,
86
+ 180,
87
+ 182,
88
+ 185,
89
+ 188,
90
+ 190,
91
+ 191,
92
+ 194,
93
+ 197,
94
+ 199,
95
+ 201,
96
+ 205,
97
+ 208,
98
+ 211,
99
+ 212,
100
+ 213,
101
+ 215,
102
+ 217,
103
+ 219,
104
+ 220,
105
+ 221,
106
+ 223,
107
+ 224,
108
+ 226,
109
+ 229,
110
+ 231,
111
+ 232,
112
+ 233,
113
+ 236,
114
+ 238,
115
+ 240,
116
+ 242,
117
+ 244,
118
+ 246,
119
+ 249,
120
+ 250,
121
+ 252,
122
+ 255,
123
+ 256,
124
+ 257,
125
+ 228,
126
+ 259,
127
+ 262,
128
+ 264,
129
+ 266,
130
+ 269,
131
+ 271,
132
+ 274,
133
+ 276,
134
+ 279,
135
+ 281,
136
+ 283,
137
+ 284,
138
+ 285,
139
+ 286,
140
+ 288,
141
+ 289,
142
+ 290,
143
+ 291,
144
+ 293,
145
+ 294,
146
+ 297,
147
+ 298,
148
+ 300,
149
+ 302,
150
+ 303,
151
+ 305,
152
+ 307,
153
+ 308,
154
+ 309,
155
+ 311,
156
+ 312,
157
+ 314,
158
+ 316,
159
+ 318,
160
+ 321,
161
+ 322,
162
+ 323,
163
+ 324,
164
+ 325,
165
+ 327,
166
+ 329,
167
+ 331,
168
+ 333,
169
+ 334,
170
+ 336,
171
+ 338,
172
+ 340,
173
+ 341,
174
+ 343,
175
+ 345,
176
+ 348,
177
+ 351,
178
+ 353,
179
+ 355,
180
+ 356,
181
+ 357,
182
+ 360,
183
+ 362,
184
+ 366,
185
+ 368,
186
+ 370,
187
+ 372,
188
+ 374,
189
+ 376,
190
+ 377,
191
+ 379,
192
+ 381,
193
+ 382,
194
+ 383,
195
+ 384,
196
+ 386,
197
+ 388,
198
+ 389,
199
+ 391,
200
+ 392,
201
+ 395,
202
+ 396,
203
+ 398,
204
+ 400,
205
+ 401,
206
+ 402,
207
+ 403,
208
+ 405,
209
+ 407,
210
+ 408,
211
+ 409,
212
+ 410,
213
+ 412,
214
+ 414,
215
+ 415,
216
+ 418,
217
+ 419,
218
+ 421,
219
+ 422,
220
+ 425,
221
+ 427,
222
+ 428,
223
+ 430,
224
+ 432,
225
+ 433,
226
+ 435,
227
+ 436,
228
+ 438,
229
+ 440,
230
+ 442,
231
+ 443,
232
+ 444,
233
+ 445,
234
+ 446,
235
+ 448,
236
+ 451,
237
+ 452,
238
+ 453,
239
+ 455,
240
+ 456,
241
+ 458,
242
+ 461,
243
+ 463,
244
+ 464,
245
+ 466,
246
+ 467,
247
+ 468,
248
+ 469,
249
+ 470,
250
+ 473,
251
+ 475,
252
+ 479,
253
+ 481,
254
+ 482,
255
+ 486,
256
+ 488,
257
+ 490,
258
+ 492,
259
+ 493,
260
+ 495,
261
+ 497,
262
+ 502,
263
+ 503,
264
+ 504,
265
+ 505,
266
+ 508,
267
+ 509,
268
+ 510,
269
+ 511,
270
+ 512,
271
+ 513,
272
+ 514,
273
+ 515,
274
+ 516,
275
+ 518,
276
+ 519,
277
+ 521,
278
+ 522,
279
+ 524,
280
+ 528,
281
+ 529,
282
+ 530,
283
+ 534,
284
+ 535,
285
+ 537,
286
+ 539,
287
+ 542,
288
+ 544,
289
+ 547,
290
+ 549,
291
+ 550,
292
+ 551,
293
+ 553,
294
+ 555,
295
+ 556,
296
+ 557,
297
+ 559,
298
+ 560,
299
+ 562,
300
+ 566,
301
+ 567,
302
+ 568,
303
+ 570,
304
+ 573,
305
+ 576,
306
+ 579,
307
+ 581,
308
+ 583,
309
+ 584,
310
+ 585,
311
+ 587,
312
+ 590,
313
+ 592,
314
+ 595,
315
+ 597,
316
+ 598,
317
+ 599,
318
+ 600,
319
+ 604,
320
+ 606,
321
+ 608,
322
+ 609,
323
+ 611,
324
+ 612,
325
+ 613,
326
+ 614,
327
+ 616,
328
+ 618,
329
+ 619,
330
+ 622,
331
+ 623,
332
+ 624,
333
+ 625,
334
+ 626,
335
+ 627,
336
+ 628,
337
+ 632,
338
+ 635,
339
+ 636,
340
+ 637,
341
+ 639,
342
+ 640,
343
+ 641,
344
+ 642,
345
+ 643,
346
+ 644
347
+ ]
348
+ }
trainable_lemmatizer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:81413d89a0bab1be8e36dd8e8cf244dfec562d7afca80202bcbde558a746b708
+ size 354285
trainable_lemmatizer/trees ADDED
Binary file (68.7 kB).
 
transformer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b4036c6e7fbb85e3e3f27549cdd461c8edc2d4a2318391b72d393a3e802f782
+ size 90665431
transformer/model/config.json DELETED
@@ -1,28 +0,0 @@
- {
- "_name_or_path": "Maltehb/-l-ctra-danish-electra-small-cased",
- "architectures": [
- "ElectraForPreTraining"
- ],
- "attention_probs_dropout_prob": 0.1,
- "embedding_size": 128,
- "generator_size": "0.25",
- "hidden_act": "gelu",
- "hidden_dropout_prob": 0.1,
- "hidden_size": 256,
- "initializer_range": 0.02,
- "intermediate_size": 1024,
- "layer_norm_eps": 1e-12,
- "max_position_embeddings": 512,
- "model_type": "electra",
- "num_attention_heads": 4,
- "num_hidden_layers": 12,
- "pad_token_id": 0,
- "position_embedding_type": "absolute",
- "summary_activation": "gelu",
- "summary_last_dropout": 0.1,
- "summary_type": "first",
- "summary_use_proj": true,
- "transformers_version": "4.5.1",
- "type_vocab_size": 2,
- "vocab_size": 32000
- }
transformer/model/special_tokens_map.json DELETED
@@ -1 +0,0 @@
- {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
transformer/model/tokenizer_config.json DELETED
@@ -1 +0,0 @@
- {"do_lower_case": false, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "special_tokens_map_file": null, "full_tokenizer_file": null, "model_max_length": 128, "name_or_path": "Maltehb/-l-ctra-danish-electra-small-cased", "do_basic_tokenize": true, "never_split": null}
transformer/model/vocab.txt DELETED
The diff for this file is too large to render. See raw diff
 
vocab/lookups.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a6f4a94131759bf84baec98b3347bcef57ffb2d6712f7f3b8f611e9ef4b3df35
- size 20402
+ oid sha256:76be8b528d0075f7aae98d6fa57a6d3c83ae480a8469e668d7b0af968995ac71
+ size 1
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5b50a86603f748496e4fd87a8aaa203a32bf82d4b3768bf54187ff40de3ca6f9
- size 460120
+ oid sha256:e23b286fc3491d7a954e1d6156fb738028114856322ad8c1035b2f377762f271
+ size 544387
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
+ {
+ "mode":"default"
+ }