Update spaCy pipeline

Files changed (10)

README.md +16 -16
config.cfg +10 -8
en_docusco_spacy_cd-any-py3-none-any.whl +2 -2
meta.json +108 -89
ner/model +0 -0
ner/moves +1 -1
tagger/cfg +24 -3
tagger/model +0 -0
tok2vec/model +1 -1
vocab/strings.json +0 -0

README.md CHANGED

@@ -14,28 +14,28 @@ model-index:
     metrics:
     - name: NER Precision
       type: precision
-      value: 0.7896141572
     - name: NER Recall
       type: recall
-      value: 0.7757775447
     - name: NER F Score
       type: f_score
-      value: 0.7826346995
   - task:
       name: TAG
       type: token-classification
     metrics:
     - name: TAG (XPOS) Accuracy
       type: accuracy
-      value: 0.9734866573
 ---
 English pipeline for part-of-speech and rhetorical tagging using a smaller 'common dictionary'.
 | Feature | Description |
 | --- | --- |
 | **Name** | `en_docusco_spacy_cd` |
-| **Version** | `1.2` |
-| **spaCy** | `>=3.5.0,<3.6.0` |
 | **Default Pipeline** | `tok2vec`, `tagger`, `ner` |
 | **Components** | `tok2vec`, `tagger`, `ner` |
 | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
@@ -47,12 +47,12 @@ English pipeline for part-of-speech and rhetorical tagging using a smaller 'comm
 <details>
-<summary>View label scheme (270 labels for 2 components)</summary>
 | Component | Labels |
 | --- | --- |
-| **`tagger`** | `APPGE`, `AT`, `AT1`, `BCL21`, `BCL22`, `CC`, `CCB`, `CS`, `CS21`, `CS22`, `CS31`, `CS32`, `CS33`, `CS41`, `CS42`, `CS43`, `CS44`, `CSA`, `CSN`, `CST`, `CSW`, `CSW31`, `CSW32`, `CSW33`, `DA`, `DA1`, `DA2`, `DAR`, `DAT`, `DB`, `DB2`, `DD`, `DD1`, `DD2`, `DDQ`, `DDQGE`, `DDQV`, `DDQV31`, `DDQV32`, `DDQV33`, `EX`, `FO`, `FU`, `FW`, `GE`, `IF`, `II`, `II21`, `II22`, `II31`, `II32`, `II33`, `II41`, `II42`, `II43`, `II44`, `IO`, `IW`, `JJ`, `JJ21`, `JJ22`, `JJ31`, `JJ32`, `JJ33`, `JJR`, `JJT`, `JK`, `MC`, `MC1`, `MC2`, `MC221`, `MC222`, `MCMC`, `MD`, `MF`, `ND1`, `NN`, `NN1`, `NN121`, `NN122`, `NN131`, `NN132`, `NN133`, `NN141`, `NN142`, `NN143`, `NN144`, `NN2`, `NN21`, `NN22`, `NN221`, `NN222`, `NN231`, `NN232`, `NN233`, `NN31`, `NN33`, `NNA`, `NNB`, `NNL1`, `NNL2`, `NNO`, `NNO2`, `NNT1`, `NNT2`, `NNU`, `NNU1`, `NNU2`, `NNU21`, `NNU22`, `NP`, `NP1`, `NP2`, `NPD1`, `NPD2`, `NPM1`, `NPM2`, `PN`, `PN1`, `PN121`, `PN122`, `PN21`, `PN22`, `PNQO`, `PNQS`, `PNQS31`, `PNQS32`, `PNQS33`, `PNQV`, `PNX1`, `PPGE`, `PPH1`, `PPHO1`, `PPHO2`, `PPHS1`, `PPHS2`, `PPIO1`, `PPIO2`, `PPIS1`, `PPIS2`, `PPX1`, `PPX121`, `PPX122`, `PPX2`, `PPX221`, `PPX222`, `PPY`, `RA`, `RA21`, `RA22`, `REX`, `REX21`, `REX22`, `REX41`, `REX42`, `REX43`, `REX44`, `RG`, `RG21`, `RG22`, `RGQ`, `RGQV`, `RGQV31`, `RGQV32`, `RGQV33`, `RGR`, `RGT`, `RL`, `RL21`, `RL22`, `RP`, `RPK`, `RR`, `RR21`, `RR22`, `RR31`, `RR32`, `RR33`, `RR41`, `RR42`, `RR43`, `RR44`, `RR51`, `RR52`, `RR53`, `RR54`, `RR55`, `RRQ`, `RRQV`, `RRQV31`, `RRQV32`, `RRQV33`, `RRR`, `RRT`, `RT`, `RT21`, `RT22`, `RT31`, `RT32`, `RT33`, `RT41`, `RT42`, `RT43`, `RT44`, `TO`, `UH`, `UH21`, `UH22`, `UH31`, `UH32`, `UH33`, `VB0`, `VBDR`, `VBDZ`, `VBG`, `VBI`, `VBM`, `VBN`, `VBR`, `VBZ`, `VD0`, `VDD`, `VDG`, `VDI`, `VDN`, `VDZ`, `VH0`, `VHD`, `VHG`, `VHI`, `VHN`, `VHZ`, `VM`, `VM21`, `VM22`, `VMK`, `VV0`, `VVD`, `VVG`, `VVGK`, `VVI`, `VVN`, `VVNK`, `VVZ`, `XX`, `Y`, `ZZ1`, `ZZ2`, `ZZ221`, `ZZ222` |
-| **`ner`** | `ActorsAbstractions`, `ActorsFirstPerson`, `ActorsPeople`, `ActorsPublicEntities`, `CitationAuthority`, `CitationControversy`, `CitationHedged`, `CitationNeutral`, `ConfidenceHedged`, `ConfidenceHigh`, `OrganizationNarrative`, `OrganizationReasoning`, `PlanningFuture`, `PlanningStrategy`, `SentimentNegative`, `SentimentPositive`, `SignpostingAcademicWritingMoves`, `SignpostingMetadiscourse`, `StanceEmphatic`, `StanceModerated` |
 </details>
@@ -60,10 +60,10 @@ English pipeline for part-of-speech and rhetorical tagging using a smaller 'comm
 | Type | Score |
 | --- | --- |
-| `TAG_ACC` | 97.35 |
-| `ENTS_F` | 78.26 |
-| `ENTS_P` | 78.96 |
-| `ENTS_R` | 77.58 |
-| `TOK2VEC_LOSS` | 5937424.94 |
-| `TAGGER_LOSS` | 1136040.49 |
-| `NER_LOSS` | 3941726.32 |

     metrics:
     - name: NER Precision
       type: precision
+      value: 0.8206658604
     - name: NER Recall
       type: recall
+      value: 0.80740266
     - name: NER F Score
       type: f_score
+      value: 0.8139802353
   - task:
       name: TAG
       type: token-classification
     metrics:
     - name: TAG (XPOS) Accuracy
       type: accuracy
+      value: 0.9763683149
 ---
 English pipeline for part-of-speech and rhetorical tagging using a smaller 'common dictionary'.
 | Feature | Description |
 | --- | --- |
 | **Name** | `en_docusco_spacy_cd` |
+| **Version** | `1.3` |
+| **spaCy** | `>=3.7.4,<3.8.0` |
 | **Default Pipeline** | `tok2vec`, `tagger`, `ner` |
 | **Components** | `tok2vec`, `tagger`, `ner` |
 | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
 <details>
+<summary>View label scheme (289 labels for 2 components)</summary>
 | Component | Labels |
 | --- | --- |
+| **`tagger`** | `APPGE`, `AT`, `AT1`, `BCL21`, `BCL22`, `CC`, `CCB`, `CS`, `CS21`, `CS22`, `CS31`, `CS32`, `CS33`, `CS41`, `CS42`, `CS43`, `CS44`, `CSA`, `CSN`, `CST`, `CSW`, `CSW31`, `CSW32`, `CSW33`, `DA`, `DA1`, `DA2`, `DAR`, `DAT`, `DB`, `DB2`, `DD`, `DD1`, `DD2`, `DDQ`, `DDQGE`, `DDQGE31`, `DDQGE32`, `DDQGE33`, `DDQV`, `DDQV31`, `DDQV32`, `DDQV33`, `EX`, `FO`, `FU`, `FW`, `GE`, `IF`, `II`, `II21`, `II22`, `II31`, `II32`, `II33`, `II41`, `II42`, `II43`, `II44`, `IO`, `IW`, `JJ`, `JJ21`, `JJ22`, `JJ31`, `JJ32`, `JJ33`, `JJ41`, `JJ42`, `JJ43`, `JJ44`, `JJR`, `JJT`, `JK`, `MC`, `MC1`, `MC121`, `MC122`, `MC2`, `MC221`, `MC222`, `MCMC`, `MD`, `MF`, `ND1`, `NN`, `NN1`, `NN121`, `NN122`, `NN131`, `NN132`, `NN133`, `NN141`, `NN142`, `NN143`, `NN144`, `NN2`, `NN21`, `NN22`, `NN221`, `NN222`, `NN31`, `NN32`, `NN33`, `NNA`, `NNB`, `NNL1`, `NNL2`, `NNO`, `NNO2`, `NNT1`, `NNT131`, `NNT132`, `NNT133`, `NNT2`, `NNU`, `NNU1`, `NNU2`, `NNU21`, `NNU22`, `NP`, `NP1`, `NP2`, `NPD1`, `NPD2`, `NPM1`, `NPM2`, `PN`, `PN1`, `PN121`, `PN122`, `PN21`, `PN22`, `PNQO`, `PNQS`, `PNQS31`, `PNQS32`, `PNQS33`, `PNQV`, `PNQV31`, `PNQV32`, `PNQV33`, `PNX1`, `PPGE`, `PPH1`, `PPHO1`, `PPHO2`, `PPHS1`, `PPHS2`, `PPIO1`, `PPIO2`, `PPIS1`, `PPIS2`, `PPX1`, `PPX121`, `PPX122`, `PPX2`, `PPX221`, `PPX222`, `PPY`, `RA`, `RA21`, `RA22`, `REX`, `REX21`, `REX22`, `REX41`, `REX42`, `REX43`, `REX44`, `RG`, `RG21`, `RG22`, `RG41`, `RG42`, `RG43`, `RG44`, `RGQ`, `RGQV`, `RGQV31`, `RGQV32`, `RGQV33`, `RGR`, `RGT`, `RL`, `RL21`, `RL22`, `RL31`, `RL32`, `RL33`, `RP`, `RPK`, `RR`, `RR21`, `RR22`, `RR31`, `RR32`, `RR33`, `RR41`, `RR42`, `RR43`, `RR44`, `RR51`, `RR52`, `RR53`, `RR54`, `RR55`, `RRQ`, `RRQV`, `RRQV31`, `RRQV32`, `RRQV33`, `RRR`, `RRT`, `RT`, `RT21`, `RT22`, `RT31`, `RT32`, `RT33`, `RT41`, `RT42`, `RT43`, `RT44`, `TO`, `UH`, `UH21`, `UH22`, `UH31`, `UH32`, `UH33`, `VB0`, `VBDR`, `VBDZ`, `VBG`, `VBI`, `VBM`, `VBN`, `VBR`, `VBZ`, `VD0`, `VDD`, `VDG`, `VDI`, `VDN`, `VDZ`, `VH0`, `VHD`, `VHG`, `VHI`, `VHN`, `VHZ`, `VM`, `VM21`, `VM22`, `VMK`, `VV0`, `VVD`, `VVG`, `VVGK`, `VVI`, `VVN`, `VVNK`, `VVZ`, `XX`, `Y`, `ZZ1`, `ZZ2`, `ZZ221`, `ZZ222` |
+| **`ner`** | `ActorsAbstractions`, `ActorsFirstPerson`, `ActorsPeople`, `ActorsPublicEntities`, `CitationAuthority`, `CitationControversy`, `CitationNeutral`, `ConfidenceHedged`, `ConfidenceHigh`, `OrganizationNarrative`, `OrganizationReasoning`, `PlanningFuture`, `PlanningStrategy`, `SentimentNegative`, `SentimentPositive`, `SignpostingAcademicWritingMoves`, `SignpostingMetadiscourse`, `StanceEmphatic`, `StanceModerated` |
 </details>
 | Type | Score |
 | --- | --- |
+| `TAG_ACC` | 97.64 |
+| `ENTS_F` | 81.40 |
+| `ENTS_P` | 82.07 |
+| `ENTS_R` | 80.74 |
+| `TOK2VEC_LOSS` | 150973939.97 |
+| `TAGGER_LOSS` | 3936874.26 |
+| `NER_LOSS` | 12742855.43 |
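
As a quick sanity check on the updated metrics, the reported NER F score should equal the harmonic mean of precision and recall. The sketch below recomputes it from the values in the README front matter for both versions. The helper name `f_score` is illustrative and not part of spaCy.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the README front matter in this diff.
old_f = f_score(0.7896141572, 0.7757775447)  # version 1.2
new_f = f_score(0.8206658604, 0.80740266)    # version 1.3

print(f"old F: {old_f:.4f}  new F: {new_f:.4f}  gain: {new_f - old_f:+.4f}")
```

Both recomputed values should agree with the `ENTS_F` rows above to the displayed precision.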

config.cfg CHANGED

@@ -1,6 +1,6 @@
 [paths]
-train = ""
-dev = ""
 vectors = null
 init_tok2vec = null
@@ -17,6 +17,7 @@ before_creation = null
 after_creation = null
 after_pipeline_creation = null
 tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
 [components]
@@ -43,6 +44,7 @@ upstream = "*"
 [components.tagger]
 factory = "tagger"
 neg_prefix = "!"
 overwrite = false
 scorer = {"@scorers":"spacy.tagger_scorer.v1"}
@@ -102,10 +104,10 @@ seed = ${system.seed}
 gpu_allocator = ${system.gpu_allocator}
 dropout = 0.1
 accumulate_gradient = 1
-patience = 1600
-max_epochs = 0
-max_steps = 35000
-eval_frequency = 250
 frozen_components = []
 annotating_components = []
 before_to_disk = null
@@ -140,8 +142,8 @@ eps = 0.00000001
 learn_rate = 0.001
 [training.score_weights]
-tag_acc = 0.5
-ents_f = 0.5
 ents_p = 0.0
 ents_r = 0.0
 ents_per_type = null

 [paths]
+train = "spacy_train_07.spacy"
+dev = "spacy_dev_07.spacy"
 vectors = null
 init_tok2vec = null
 after_creation = null
 after_pipeline_creation = null
 tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
+vectors = {"@vectors":"spacy.Vectors.v1"}
 [components]
 [components.tagger]
 factory = "tagger"
+label_smoothing = 0.05
 neg_prefix = "!"
 overwrite = false
 scorer = {"@scorers":"spacy.tagger_scorer.v1"}
 gpu_allocator = ${system.gpu_allocator}
 dropout = 0.1
 accumulate_gradient = 1
+patience = 20000
+max_epochs = -1
+max_steps = 80000
+eval_frequency = 1000
 frozen_components = []
 annotating_components = []
 before_to_disk = null
 learn_rate = 0.001
 [training.score_weights]
+tag_acc = 0.4
+ents_f = 0.6
 ents_p = 0.0
 ents_r = 0.0
 ents_per_type = null
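
The change to `[training.score_weights]` shifts model selection toward NER. A minimal sketch of the effect, assuming the selection score is the weighted sum of the listed metrics (as spaCy's training documentation describes for `score_weights`) and reusing the final scores reported in `meta.json`. The function name `weighted_score` is hypothetical.

```python
def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted sum of metrics; null (None) and zero weights contribute nothing."""
    return sum(scores[name] * w for name, w in weights.items() if w)


# Final scores reported for version 1.3 in meta.json.
scores = {"tag_acc": 0.9763683149, "ents_f": 0.8139802353,
          "ents_p": 0.8206658604, "ents_r": 0.80740266}

old_weights = {"tag_acc": 0.5, "ents_f": 0.5, "ents_p": 0.0,
               "ents_r": 0.0, "ents_per_type": None}
new_weights = {"tag_acc": 0.4, "ents_f": 0.6, "ents_p": 0.0,
               "ents_r": 0.0, "ents_per_type": None}

score_old_weights = weighted_score(scores, old_weights)
score_new_weights = weighted_score(scores, new_weights)
print(score_old_weights, score_new_weights)
```

Because `ents_f` is lower than `tag_acc`, the same checkpoint scores lower under the new weights. The two numbers are therefore not comparable across versions.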

en_docusco_spacy_cd-any-py3-none-any.whl CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f1fc8685dd2f705596754da88344d1b98cb6283710de43c2cc6f2710c0ce5b8e
-size 6956762

 version https://git-lfs.github.com/spec/v1
+oid sha256:3c4ba27b27fa3effb8af587c05fc1d6a1a7a312ced5d884a65cbb048a84e8a93
+size 8394802
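
The wheel is stored through Git LFS, so the diff only shows the pointer file (spec version, object hash, and size in bytes). A small sketch that parses the two pointers shown above and reports the size change. `parse_lfs_pointer` is an illustrative helper, not a Git LFS API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:f1fc8685dd2f705596754da88344d1b98cb6283710de43c2cc6f2710c0ce5b8e
size 6956762
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3c4ba27b27fa3effb8af587c05fc1d6a1a7a312ced5d884a65cbb048a84e8a93
size 8394802
"""

old_ptr = parse_lfs_pointer(OLD_POINTER)
new_ptr = parse_lfs_pointer(NEW_POINTER)
size_delta = new_ptr["size"] - old_ptr["size"]
print(f"wheel grew by {size_delta} bytes")
```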

meta.json CHANGED

@@ -1,13 +1,14 @@
 {
   "lang":"en",
   "name":"docusco_spacy_cd",
-  "version":"1.2",
   "description":"English pipeline for part-of-speech and rhetorical tagging using a smaller 'common dictionary'.",
   "author":"David Brown",
   "email":"dwb2@andrew.cmu.edu",
   "url":"https://docuscope.github.io",
   "license":"MIT",
-  "spacy_git_version":"Unknown",
   "vectors":{
     "width":0,
     "vectors":0,
@@ -55,6 +56,9 @@
       "DD2",
       "DDQ",
       "DDQGE",
       "DDQV",
       "DDQV31",
       "DDQV32",
@@ -83,11 +87,17 @@
       "JJ31",
       "JJ32",
       "JJ33",
       "JJR",
       "JJT",
       "JK",
       "MC",
       "MC1",
       "MC2",
       "MC221",
       "MC222",
@@ -111,10 +121,8 @@
       "NN22",
       "NN221",
       "NN222",
-      "NN231",
-      "NN232",
-      "NN233",
       "NN31",
       "NN33",
       "NNA",
       "NNB",
@@ -123,6 +131,9 @@
       "NNO",
       "NNO2",
       "NNT1",
       "NNT2",
       "NNU",
       "NNU1",
@@ -148,6 +159,9 @@
       "PNQS32",
       "PNQS33",
       "PNQV",
       "PNX1",
       "PPGE",
       "PPH1",
@@ -179,6 +193,10 @@
       "RG",
       "RG21",
       "RG22",
       "RGQ",
       "RGQV",
       "RGQV31",
@@ -189,6 +207,9 @@
       "RL",
       "RL21",
       "RL22",
       "RP",
       "RPK",
       "RR",
@@ -277,7 +298,6 @@
       "ActorsPublicEntities",
       "CitationAuthority",
       "CitationControversy",
-      "CitationHedged",
       "CitationNeutral",
       "ConfidenceHedged",
       "ConfidenceHigh",
@@ -307,112 +327,111 @@
   ],
   "performance":{
-    "tag_acc":0.9734866573,
-    "ents_f":0.7826346995,
-    "ents_p":0.7896141572,
-    "ents_r":0.7757775447,
     "ents_per_type":{
-      "ActorsFirstPerson":{
-        "p":0.8180863993,
-        "r":0.8443950279,
-        "f":0.8310325477
-      },
       "ActorsPeople":{
-        "p":0.855646716,
-        "r":0.8830180997,
-        "f":0.8691169573
       },
-      "CitationNeutral":{
-        "p":0.7527158376,
-        "r":0.7425267908,
-        "f":0.7475865985
       },
-      "SentimentNegative":{
-        "p":0.7090227054,
-        "r":0.6414706491,
-        "f":0.6735571909
       },
-      "ActorsAbstractions":{
-        "p":0.7691966267,
-        "r":0.8180602299,
-        "f":0.7928762968
       },
-      "PlanningStrategy":{
-        "p":0.721360087,
-        "r":0.6471005497,
-        "f":0.6822154709
       },
-      "SignpostingAcademicWritingMoves":{
-        "p":0.6508373967,
-        "r":0.5615511249,
-        "f":0.60290652
       },
-      "OrganizationNarrative":{
-        "p":0.7794650481,
-        "r":0.7044971007,
-        "f":0.74008743
       },
-      "PlanningFuture":{
-        "p":0.7692376361,
-        "r":0.7372518823,
-        "f":0.7529051988
       },
-      "ConfidenceHedged":{
-        "p":0.8042735043,
-        "r":0.7942268737,
-        "f":0.7992186173
       },
-      "SentimentPositive":{
-        "p":0.7252046892,
-        "r":0.6383048418,
-        "f":0.678985594
       },
-      "StanceEmphatic":{
-        "p":0.7869956077,
-        "r":0.8246021505,
-        "f":0.8053601059
       },
-      "SignpostingMetadiscourse":{
-        "p":0.9005463375,
-        "r":0.8505900903,
-        "f":0.8748556405
       },
-      "CitationControversy":{
-        "p":0.7772643253,
-        "r":0.7109044801,
-        "f":0.7426048565
       },
-      "ActorsPublicEntities":{
-        "p":0.8000974738,
-        "r":0.7844855049,
-        "f":0.7922145816
       },
-      "OrganizationReasoning":{
-        "p":0.8085699285,
-        "r":0.8050261359,
-        "f":0.8067941408
       },
-      "ConfidenceHigh":{
-        "p":0.7475272184,
-        "r":0.7257841647,
-        "f":0.7364952501
       },
-      "CitationAuthority":{
-        "p":0.7098366882,
-        "r":0.6263404826,
-        "f":0.6654797935
       },
       "StanceModerated":{
-        "p":0.7352631579,
-        "r":0.7569764292,
-        "f":0.7459618209
       }
     },
-    "tok2vec_loss":59374.2493771126,
-    "tagger_loss":11360.4048709869,
-    "ner_loss":39417.2632061559
   },
-  "spacy_version":">=3.5.0,<3.6.0",
   "requirements":[
   ]

 {
   "lang":"en",
   "name":"docusco_spacy_cd",
+  "version":"1.3",
   "description":"English pipeline for part-of-speech and rhetorical tagging using a smaller 'common dictionary'.",
   "author":"David Brown",
   "email":"dwb2@andrew.cmu.edu",
   "url":"https://docuscope.github.io",
   "license":"MIT",
+  "spacy_version":">=3.7.4,<3.8.0",
+  "spacy_git_version":"bff8725f4",
   "vectors":{
     "width":0,
     "vectors":0,
       "DD2",
       "DDQ",
       "DDQGE",
+      "DDQGE31",
+      "DDQGE32",
+      "DDQGE33",
       "DDQV",
       "DDQV31",
       "DDQV32",
       "JJ31",
       "JJ32",
       "JJ33",
+      "JJ41",
+      "JJ42",
+      "JJ43",
+      "JJ44",
       "JJR",
       "JJT",
       "JK",
       "MC",
       "MC1",
+      "MC121",
+      "MC122",
       "MC2",
       "MC221",
       "MC222",
       "NN22",
       "NN221",
       "NN222",
       "NN31",
+      "NN32",
       "NN33",
       "NNA",
       "NNB",
       "NNO",
       "NNO2",
       "NNT1",
+      "NNT131",
+      "NNT132",
+      "NNT133",
       "NNT2",
       "NNU",
       "NNU1",
       "PNQS32",
       "PNQS33",
       "PNQV",
+      "PNQV31",
+      "PNQV32",
+      "PNQV33",
       "PNX1",
       "PPGE",
       "PPH1",
       "RG",
       "RG21",
       "RG22",
+      "RG41",
+      "RG42",
+      "RG43",
+      "RG44",
       "RGQ",
       "RGQV",
       "RGQV31",
       "RL",
       "RL21",
       "RL22",
+      "RL31",
+      "RL32",
+      "RL33",
       "RP",
       "RPK",
       "RR",
       "ActorsPublicEntities",
       "CitationAuthority",
       "CitationControversy",
       "CitationNeutral",
       "ConfidenceHedged",
       "ConfidenceHigh",
   ],
   "performance":{
+    "tag_acc":0.9763683149,
+    "ents_f":0.8139802353,
+    "ents_p":0.8206658604,
+    "ents_r":0.80740266,
     "ents_per_type":{
       "ActorsPeople":{
+        "p":0.8542168374,
+        "r":0.8696353974,
+        "f":0.8618571637
       },
+      "ActorsPublicEntities":{
+        "p":0.8169841646,
+        "r":0.8246103931,
+        "f":0.8207795646
       },
+      "OrganizationReasoning":{
+        "p":0.8497536946,
+        "r":0.8395089039,
+        "f":0.8446002337
       },
+      "ActorsFirstPerson":{
+        "p":0.8645147555,
+        "r":0.8759769676,
+        "f":0.8702081187
       },
+      "ConfidenceHedged":{
+        "p":0.8414330099,
+        "r":0.849020822,
+        "f":0.8452098865
       },
+      "SentimentPositive":{
+        "p":0.7541410809,
+        "r":0.6988926856,
+        "f":0.7254665342
       },
+      "SignpostingMetadiscourse":{
+        "p":0.9222331178,
+        "r":0.8799657453,
+        "f":0.9006037785
       },
+      "ActorsAbstractions":{
+        "p":0.812620511,
+        "r":0.8397741356,
+        "f":0.825974217
       },
+      "CitationAuthority":{
+        "p":0.7421895511,
+        "r":0.6606683805,
+        "f":0.6990603363
       },
+      "SentimentNegative":{
+        "p":0.7569732066,
+        "r":0.681115792,
+        "f":0.7170438069
       },
+      "OrganizationNarrative":{
+        "p":0.8146691347,
+        "r":0.7606297812,
+        "f":0.7867225698
       },
+      "StanceEmphatic":{
+        "p":0.8325835219,
+        "r":0.8587117676,
+        "f":0.8454458216
       },
+      "ConfidenceHigh":{
+        "p":0.793492611,
+        "r":0.7964435325,
+        "f":0.7949653333
       },
+      "PlanningFuture":{
+        "p":0.8015720524,
+        "r":0.7731229292,
+        "f":0.7870905037
       },
+      "SignpostingAcademicWritingMoves":{
+        "p":0.6799470549,
+        "r":0.6417239225,
+        "f":0.6602827763
       },
+      "PlanningStrategy":{
+        "p":0.7405392335,
+        "r":0.7067443605,
+        "f":0.7232472325
       },
+      "CitationNeutral":{
+        "p":0.8012995179,
+        "r":0.7580805076,
+        "f":0.7790910944
       },
       "StanceModerated":{
+        "p":0.8127539304,
+        "r":0.8244042286,
+        "f":0.8185376268
+      },
+      "CitationControversy":{
+        "p":0.7450381679,
+        "r":0.7160674982,
+        "f":0.7302656192
       }
     },
+    "tok2vec_loss":1509739.3996848087,
+    "tagger_loss":39368.7426280975,
+    "ner_loss":127428.554314194
   },
   "requirements":[
   ]
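
The aggregate NER F score improved, but the per-type entries do not all move in the same direction. The sketch below compares the `f` values for a handful of labels copied from the old and new `ents_per_type` blocks. The selection of labels is arbitrary and only meant to illustrate the comparison.

```python
# "f" values copied from the old (1.2) and new (1.3) ents_per_type blocks.
old_f = {
    "SignpostingAcademicWritingMoves": 0.60290652,
    "CitationAuthority": 0.6654797935,
    "SentimentNegative": 0.6735571909,
    "CitationControversy": 0.7426048565,
    "ActorsPeople": 0.8691169573,
}
new_f = {
    "SignpostingAcademicWritingMoves": 0.6602827763,
    "CitationAuthority": 0.6990603363,
    "SentimentNegative": 0.7170438069,
    "CitationControversy": 0.7302656192,
    "ActorsPeople": 0.8618571637,
}

# Per-label change in F score between versions.
deltas = {label: new_f[label] - old_f[label] for label in old_f}
regressions = sorted(label for label, d in deltas.items() if d < 0)

for label, d in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{label:35s} {d:+.4f}")
```

In this sample, `ActorsPeople` and `CitationControversy` drop slightly while the other labels gain several points.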

ner/model CHANGED

Binary files a/ner/model and b/ner/model differ

ner/moves CHANGED

@@ -1 +1 @@

- moves {"0":{},"1":{"ActorsAbstractions":574624,"SentimentNegative":498816,"ActorsPeople":490889,"SentimentPositive":329200,"OrganizationNarrative":327795,"SignpostingMetadiscourse":287016,"ActorsFirstPerson":242625,"OrganizationReasoning":182969,"StanceEmphatic":148909,"ActorsPublicEntities":141388,"ConfidenceHedged":132889,"ConfidenceHigh":117539,"PlanningFuture":91199,"PlanningStrategy":77436,"SignpostingAcademicWritingMoves":45355,"CitationNeutral":28827,"StanceModerated":24999,"CitationAuthority":24695,"CitationControversy":7780,"CitationHedged":3},"2":{"ActorsAbstractions":574624,"SentimentNegative":498816,"ActorsPeople":490889,"SentimentPositive":329200,"OrganizationNarrative":327795,"SignpostingMetadiscourse":287016,"ActorsFirstPerson":242625,"OrganizationReasoning":182969,"StanceEmphatic":148909,"ActorsPublicEntities":141388,"ConfidenceHedged":132889,"ConfidenceHigh":117539,"PlanningFuture":91199,"PlanningStrategy":77436,"SignpostingAcademicWritingMoves":45355,"CitationNeutral":28827,"StanceModerated":24999,"CitationAuthority":24695,"CitationControversy":7780,"CitationHedged":3},"3":{"ActorsAbstractions":574624,"SentimentNegative":498816,"ActorsPeople":490889,"SentimentPositive":329200,"OrganizationNarrative":327795,"SignpostingMetadiscourse":287016,"ActorsFirstPerson":242625,"OrganizationReasoning":182969,"StanceEmphatic":148909,"ActorsPublicEntities":141388,"ConfidenceHedged":132889,"ConfidenceHigh":117539,"PlanningFuture":91199,"PlanningStrategy":77436,"SignpostingAcademicWritingMoves":45355,"CitationNeutral":28827,"StanceModerated":24999,"CitationAuthority":24695,"CitationControversy":7780,"CitationHedged":3},"4":{"ActorsAbstractions":574624,"SentimentNegative":498816,"ActorsPeople":490889,"SentimentPositive":329200,"OrganizationNarrative":327795,"SignpostingMetadiscourse":287016,"ActorsFirstPerson":242625,"OrganizationReasoning":182969,"StanceEmphatic":148909,"ActorsPublicEntities":141388,"ConfidenceHedged":132889,"ConfidenceHigh":117539,"PlanningFuture":91199,"PlanningStrategy":77436,"SignpostingAcademicWritingMoves":45355,"CitationNeutral":28827,"StanceModerated":24999,"CitationAuthority":24695,"CitationControversy":7780,"CitationHedged":3,"":1},"5":{"":1}} cfg neg_key

+ moves {"0":{},"1":{"ActorsPeople":2252459,"ActorsAbstractions":2160829,"SentimentNegative":1838447,"OrganizationNarrative":1220253,"SentimentPositive":1215068,"SignpostingMetadiscourse":982819,"ActorsFirstPerson":942047,"OrganizationReasoning":603068,"StanceEmphatic":540777,"ActorsPublicEntities":488472,"ConfidenceHedged":449697,"ConfidenceHigh":422991,"PlanningFuture":318827,"PlanningStrategy":277732,"SignpostingAcademicWritingMoves":153321,"CitationNeutral":95864,"StanceModerated":85078,"CitationAuthority":80084,"CitationControversy":22589},"2":{"ActorsPeople":2252459,"ActorsAbstractions":2160829,"SentimentNegative":1838447,"OrganizationNarrative":1220253,"SentimentPositive":1215068,"SignpostingMetadiscourse":982819,"ActorsFirstPerson":942047,"OrganizationReasoning":603068,"StanceEmphatic":540777,"ActorsPublicEntities":488472,"ConfidenceHedged":449697,"ConfidenceHigh":422991,"PlanningFuture":318827,"PlanningStrategy":277732,"SignpostingAcademicWritingMoves":153321,"CitationNeutral":95864,"StanceModerated":85078,"CitationAuthority":80084,"CitationControversy":22589},"3":{"ActorsPeople":2252459,"ActorsAbstractions":2160829,"SentimentNegative":1838447,"OrganizationNarrative":1220253,"SentimentPositive":1215068,"SignpostingMetadiscourse":982819,"ActorsFirstPerson":942047,"OrganizationReasoning":603068,"StanceEmphatic":540777,"ActorsPublicEntities":488472,"ConfidenceHedged":449697,"ConfidenceHigh":422991,"PlanningFuture":318827,"PlanningStrategy":277732,"SignpostingAcademicWritingMoves":153321,"CitationNeutral":95864,"StanceModerated":85078,"CitationAuthority":80084,"CitationControversy":22589},"4":{"ActorsPeople":2252459,"ActorsAbstractions":2160829,"SentimentNegative":1838447,"OrganizationNarrative":1220253,"SentimentPositive":1215068,"SignpostingMetadiscourse":982819,"ActorsFirstPerson":942047,"OrganizationReasoning":603068,"StanceEmphatic":540777,"ActorsPublicEntities":488472,"ConfidenceHedged":449697,"ConfidenceHigh":422991,"PlanningFuture":318827,"PlanningStrategy":277732,"SignpostingAcademicWritingMoves":153321,"CitationNeutral":95864,"StanceModerated":85078,"CitationAuthority":80084,"CitationControversy":22589,"":1},"5":{"":1}} cfg neg_key

tagger/cfg CHANGED

@@ -1,4 +1,5 @@
 {
   "labels":[
     "APPGE",
     "AT",
@@ -36,6 +37,9 @@
     "DD2",
     "DDQ",
     "DDQGE",
     "DDQV",
     "DDQV31",
     "DDQV32",
@@ -64,11 +68,17 @@
     "JJ31",
     "JJ32",
     "JJ33",
     "JJR",
     "JJT",
     "JK",
     "MC",
     "MC1",
     "MC2",
     "MC221",
     "MC222",
@@ -92,10 +102,8 @@
     "NN22",
     "NN221",
     "NN222",
-    "NN231",
-    "NN232",
-    "NN233",
     "NN31",
     "NN33",
     "NNA",
     "NNB",
@@ -104,6 +112,9 @@
     "NNO",
     "NNO2",
     "NNT1",
     "NNT2",
     "NNU",
     "NNU1",
@@ -129,6 +140,9 @@
     "PNQS32",
     "PNQS33",
     "PNQV",
     "PNX1",
     "PPGE",
     "PPH1",
@@ -160,6 +174,10 @@
     "RG",
     "RG21",
     "RG22",
     "RGQ",
     "RGQV",
     "RGQV31",
@@ -170,6 +188,9 @@
     "RL",
     "RL21",
     "RL22",
     "RP",
     "RPK",
     "RR",

 {
+  "label_smoothing":0.05,
   "labels":[
     "APPGE",
     "AT",
     "DD2",
     "DDQ",
     "DDQGE",
+    "DDQGE31",
+    "DDQGE32",
+    "DDQGE33",
     "DDQV",
     "DDQV31",
     "DDQV32",
     "JJ31",
     "JJ32",
     "JJ33",
+    "JJ41",
+    "JJ42",
+    "JJ43",
+    "JJ44",
     "JJR",
     "JJT",
     "JK",
     "MC",
     "MC1",
+    "MC121",
+    "MC122",
     "MC2",
     "MC221",
     "MC222",
     "NN22",
     "NN221",
     "NN222",
     "NN31",
+    "NN32",
     "NN33",
     "NNA",
     "NNB",
     "NNO",
     "NNO2",
     "NNT1",
+    "NNT131",
+    "NNT132",
+    "NNT133",
     "NNT2",
     "NNU",
     "NNU1",
     "PNQS32",
     "PNQS33",
     "PNQV",
+    "PNQV31",
+    "PNQV32",
+    "PNQV33",
     "PNX1",
     "PPGE",
     "PPH1",
     "RG",
     "RG21",
     "RG22",
+    "RG41",
+    "RG42",
+    "RG43",
+    "RG44",
     "RGQ",
     "RGQV",
     "RGQV31",
     "RL",
     "RL21",
     "RL22",
+    "RL31",
+    "RL32",
+    "RL33",
     "RP",
     "RPK",
     "RR",
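
The label changes in `tagger/cfg` can be reconciled with the totals quoted in the README summary (270 labels before, 289 after, counted across both components). A short sketch, assuming those totals are the sum of tagger and ner labels; the lists are copied from the added and removed lines in this diff.

```python
# Tagger labels added and removed in tagger/cfg, copied from the diff.
added = [
    "DDQGE31", "DDQGE32", "DDQGE33",
    "JJ41", "JJ42", "JJ43", "JJ44",
    "MC121", "MC122",
    "NN32",
    "NNT131", "NNT132", "NNT133",
    "PNQV31", "PNQV32", "PNQV33",
    "RG41", "RG42", "RG43", "RG44",
    "RL31", "RL32", "RL33",
]
removed = ["NN231", "NN232", "NN233"]

# The ner component dropped one label (CitationHedged) in the same commit.
ner_removed = ["CitationHedged"]

tagger_net = len(added) - len(removed)
total_net = tagger_net - len(ner_removed)
print(f"tagger: {tagger_net:+d} labels, overall: {total_net:+d} labels")
```

The overall change of +19 labels matches the move from 270 to 289 in the README summary line.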

tagger/model CHANGED

Binary files a/tagger/model and b/tagger/model differ

tok2vec/model CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2bef1c838277d6641b02d24484bfc90fc6cab1da7c6972fdcb9ddd1d37318a30
 size 6009091

 version https://git-lfs.github.com/spec/v1
+oid sha256:58e0806e259d1699a33eb0804db3d207aea31ea5aba7826c5f32b62076f718c4
 size 6009091

vocab/strings.json CHANGED

The diff for this file is too large to render.