flair-hipe-2022-ajmc-en / training.log
2023-10-17 09:40:41,085 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,086 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:40:41,086 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,086 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-17 09:40:41,086 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,086 Train: 1214 sentences
2023-10-17 09:40:41,086 (train_with_dev=False, train_with_test=False)
2023-10-17 09:40:41,086 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,086 Training Params:
2023-10-17 09:40:41,086 - learning_rate: "3e-05"
2023-10-17 09:40:41,086 - mini_batch_size: "4"
2023-10-17 09:40:41,087 - max_epochs: "10"
2023-10-17 09:40:41,087 - shuffle: "True"
2023-10-17 09:40:41,087 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,087 Plugins:
2023-10-17 09:40:41,087 - TensorboardLogger
2023-10-17 09:40:41,087 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:40:41,087 ----------------------------------------------------------------------------------------------------
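Note: the `lr` column in the per-iteration lines that follow is consistent with a linear warmup to the peak rate followed by linear decay to zero. The sketch below reproduces those values under stated assumptions. It assumes one scheduler step per mini-batch, warmup over the first 10% of all steps, and a step count derived from the logged training parameters (1214 sentences, batch size 4, 10 epochs). The function name `linear_warmup_lr` is hypothetical and is not part of Flair.

```python
import math

def linear_warmup_lr(step, peak_lr=3e-5, train_sentences=1214,
                     mini_batch_size=4, max_epochs=10, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps (assumed schedule shape)."""
    # 1214 sentences in batches of 4 gives 304 iterations per epoch, as logged.
    steps_per_epoch = math.ceil(train_sentences / mini_batch_size)
    total_steps = steps_per_epoch * max_epochs           # 3040
    warmup_steps = round(total_steps * warmup_fraction)  # 304, i.e. one epoch
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from the peak down to 0 at the final step.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

# Epoch 1, iter 30 -> global step 30; epoch 2, iter 60 -> global step 364.
print(f"{linear_warmup_lr(30):.6f}")
print(f"{linear_warmup_lr(364):.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which is where the logged rate reaches 0.000030 before it starts to fall.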
2023-10-17 09:40:41,087 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:40:41,087 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:40:41,087 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,087 Computation:
2023-10-17 09:40:41,087 - compute on device: cuda:0
2023-10-17 09:40:41,087 - embedding storage: none
2023-10-17 09:40:41,087 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,087 Model training base path: "hmbench-ajmc/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 09:40:41,087 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,087 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:41,087 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 09:40:42,550 epoch 1 - iter 30/304 - loss 4.00925954 - time (sec): 1.46 - samples/sec: 2025.07 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:40:44,107 epoch 1 - iter 60/304 - loss 3.29571244 - time (sec): 3.02 - samples/sec: 2034.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:40:45,575 epoch 1 - iter 90/304 - loss 2.50210834 - time (sec): 4.49 - samples/sec: 2078.76 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:40:46,958 epoch 1 - iter 120/304 - loss 2.06250957 - time (sec): 5.87 - samples/sec: 2083.13 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:40:48,324 epoch 1 - iter 150/304 - loss 1.74686239 - time (sec): 7.24 - samples/sec: 2147.08 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:40:49,607 epoch 1 - iter 180/304 - loss 1.53869916 - time (sec): 8.52 - samples/sec: 2147.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:40:50,903 epoch 1 - iter 210/304 - loss 1.37891942 - time (sec): 9.81 - samples/sec: 2169.99 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:40:52,181 epoch 1 - iter 240/304 - loss 1.24625376 - time (sec): 11.09 - samples/sec: 2195.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:40:53,524 epoch 1 - iter 270/304 - loss 1.13828827 - time (sec): 12.44 - samples/sec: 2212.10 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:40:54,853 epoch 1 - iter 300/304 - loss 1.04580906 - time (sec): 13.76 - samples/sec: 2231.07 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:40:55,031 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:55,031 EPOCH 1 done: loss 1.0377 - lr: 0.000030
2023-10-17 09:40:56,017 DEV : loss 0.196847066283226 - f1-score (micro avg) 0.5828
2023-10-17 09:40:56,027 saving best model
2023-10-17 09:40:56,403 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:57,759 epoch 2 - iter 30/304 - loss 0.22973336 - time (sec): 1.35 - samples/sec: 2261.94 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:40:59,251 epoch 2 - iter 60/304 - loss 0.19011469 - time (sec): 2.85 - samples/sec: 2194.87 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:41:00,618 epoch 2 - iter 90/304 - loss 0.18365433 - time (sec): 4.21 - samples/sec: 2166.12 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:41:02,006 epoch 2 - iter 120/304 - loss 0.17906680 - time (sec): 5.60 - samples/sec: 2167.64 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:41:03,473 epoch 2 - iter 150/304 - loss 0.16917255 - time (sec): 7.07 - samples/sec: 2164.28 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:41:04,821 epoch 2 - iter 180/304 - loss 0.16214454 - time (sec): 8.42 - samples/sec: 2170.92 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:41:06,286 epoch 2 - iter 210/304 - loss 0.15246777 - time (sec): 9.88 - samples/sec: 2153.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:41:07,746 epoch 2 - iter 240/304 - loss 0.14813300 - time (sec): 11.34 - samples/sec: 2151.10 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:41:09,180 epoch 2 - iter 270/304 - loss 0.14943578 - time (sec): 12.78 - samples/sec: 2162.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:41:10,552 epoch 2 - iter 300/304 - loss 0.14825159 - time (sec): 14.15 - samples/sec: 2170.02 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:41:10,730 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:10,730 EPOCH 2 done: loss 0.1473 - lr: 0.000027
2023-10-17 09:41:11,707 DEV : loss 0.15535008907318115 - f1-score (micro avg) 0.7654
2023-10-17 09:41:11,716 saving best model
2023-10-17 09:41:12,173 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:13,559 epoch 3 - iter 30/304 - loss 0.09876950 - time (sec): 1.38 - samples/sec: 2285.48 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:41:14,972 epoch 3 - iter 60/304 - loss 0.08919063 - time (sec): 2.80 - samples/sec: 2147.48 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:41:16,416 epoch 3 - iter 90/304 - loss 0.08479748 - time (sec): 4.24 - samples/sec: 2115.18 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:41:17,786 epoch 3 - iter 120/304 - loss 0.08416981 - time (sec): 5.61 - samples/sec: 2126.93 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:41:19,251 epoch 3 - iter 150/304 - loss 0.08841609 - time (sec): 7.07 - samples/sec: 2114.10 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:41:20,603 epoch 3 - iter 180/304 - loss 0.08959002 - time (sec): 8.43 - samples/sec: 2140.67 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:41:21,969 epoch 3 - iter 210/304 - loss 0.08360212 - time (sec): 9.79 - samples/sec: 2164.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:41:23,364 epoch 3 - iter 240/304 - loss 0.08313582 - time (sec): 11.19 - samples/sec: 2200.52 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:41:24,716 epoch 3 - iter 270/304 - loss 0.07903175 - time (sec): 12.54 - samples/sec: 2212.99 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:41:26,117 epoch 3 - iter 300/304 - loss 0.08135561 - time (sec): 13.94 - samples/sec: 2202.04 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:26,295 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:26,295 EPOCH 3 done: loss 0.0825 - lr: 0.000023
2023-10-17 09:41:27,316 DEV : loss 0.15102672576904297 - f1-score (micro avg) 0.8162
2023-10-17 09:41:27,326 saving best model
2023-10-17 09:41:27,813 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:29,243 epoch 4 - iter 30/304 - loss 0.03313024 - time (sec): 1.43 - samples/sec: 1954.02 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:30,597 epoch 4 - iter 60/304 - loss 0.05504111 - time (sec): 2.78 - samples/sec: 1993.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:41:31,945 epoch 4 - iter 90/304 - loss 0.06404305 - time (sec): 4.13 - samples/sec: 2039.54 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:41:33,306 epoch 4 - iter 120/304 - loss 0.05743194 - time (sec): 5.49 - samples/sec: 2067.00 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:41:34,696 epoch 4 - iter 150/304 - loss 0.05944006 - time (sec): 6.88 - samples/sec: 2108.15 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:41:36,115 epoch 4 - iter 180/304 - loss 0.06091412 - time (sec): 8.30 - samples/sec: 2143.11 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:41:37,453 epoch 4 - iter 210/304 - loss 0.06464068 - time (sec): 9.64 - samples/sec: 2167.63 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:41:38,782 epoch 4 - iter 240/304 - loss 0.06497940 - time (sec): 10.97 - samples/sec: 2192.80 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:41:40,119 epoch 4 - iter 270/304 - loss 0.06477435 - time (sec): 12.30 - samples/sec: 2218.30 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:41:41,492 epoch 4 - iter 300/304 - loss 0.06200751 - time (sec): 13.68 - samples/sec: 2236.85 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:41:41,687 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:41,687 EPOCH 4 done: loss 0.0626 - lr: 0.000020
2023-10-17 09:41:42,641 DEV : loss 0.1982080489397049 - f1-score (micro avg) 0.853
2023-10-17 09:41:42,649 saving best model
2023-10-17 09:41:43,137 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:44,473 epoch 5 - iter 30/304 - loss 0.02592012 - time (sec): 1.33 - samples/sec: 2385.92 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:41:45,801 epoch 5 - iter 60/304 - loss 0.04868729 - time (sec): 2.66 - samples/sec: 2231.74 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:41:47,152 epoch 5 - iter 90/304 - loss 0.04874142 - time (sec): 4.01 - samples/sec: 2241.31 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:41:48,527 epoch 5 - iter 120/304 - loss 0.05426644 - time (sec): 5.38 - samples/sec: 2203.02 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:41:49,849 epoch 5 - iter 150/304 - loss 0.05072023 - time (sec): 6.71 - samples/sec: 2193.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:41:51,190 epoch 5 - iter 180/304 - loss 0.04758743 - time (sec): 8.05 - samples/sec: 2263.13 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:41:52,577 epoch 5 - iter 210/304 - loss 0.04301930 - time (sec): 9.43 - samples/sec: 2282.60 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:41:53,927 epoch 5 - iter 240/304 - loss 0.04057396 - time (sec): 10.78 - samples/sec: 2274.76 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:41:55,260 epoch 5 - iter 270/304 - loss 0.04417344 - time (sec): 12.12 - samples/sec: 2266.18 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:41:56,631 epoch 5 - iter 300/304 - loss 0.04586453 - time (sec): 13.49 - samples/sec: 2269.48 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:41:56,810 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:56,810 EPOCH 5 done: loss 0.0458 - lr: 0.000017
2023-10-17 09:41:57,802 DEV : loss 0.19336670637130737 - f1-score (micro avg) 0.8592
2023-10-17 09:41:57,811 saving best model
2023-10-17 09:41:58,356 ----------------------------------------------------------------------------------------------------
2023-10-17 09:41:59,948 epoch 6 - iter 30/304 - loss 0.01221848 - time (sec): 1.59 - samples/sec: 2075.28 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:42:01,580 epoch 6 - iter 60/304 - loss 0.02836975 - time (sec): 3.22 - samples/sec: 2062.77 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:42:03,169 epoch 6 - iter 90/304 - loss 0.02441422 - time (sec): 4.81 - samples/sec: 2022.26 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:42:04,789 epoch 6 - iter 120/304 - loss 0.01997158 - time (sec): 6.43 - samples/sec: 1989.89 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:42:06,367 epoch 6 - iter 150/304 - loss 0.02226174 - time (sec): 8.01 - samples/sec: 1976.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:42:07,771 epoch 6 - iter 180/304 - loss 0.03289946 - time (sec): 9.41 - samples/sec: 1970.20 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:42:09,129 epoch 6 - iter 210/304 - loss 0.03414661 - time (sec): 10.77 - samples/sec: 1983.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:42:10,498 epoch 6 - iter 240/304 - loss 0.03480877 - time (sec): 12.14 - samples/sec: 2008.11 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:42:11,866 epoch 6 - iter 270/304 - loss 0.03370047 - time (sec): 13.51 - samples/sec: 2030.09 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:42:13,213 epoch 6 - iter 300/304 - loss 0.03260773 - time (sec): 14.85 - samples/sec: 2062.93 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:42:13,393 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:13,393 EPOCH 6 done: loss 0.0323 - lr: 0.000013
2023-10-17 09:42:14,350 DEV : loss 0.20636732876300812 - f1-score (micro avg) 0.864
2023-10-17 09:42:14,363 saving best model
2023-10-17 09:42:14,889 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:16,291 epoch 7 - iter 30/304 - loss 0.03710421 - time (sec): 1.39 - samples/sec: 2059.99 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:42:17,714 epoch 7 - iter 60/304 - loss 0.03619170 - time (sec): 2.81 - samples/sec: 2069.57 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:42:19,250 epoch 7 - iter 90/304 - loss 0.03524165 - time (sec): 4.35 - samples/sec: 2081.46 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:42:20,676 epoch 7 - iter 120/304 - loss 0.03009571 - time (sec): 5.77 - samples/sec: 2080.39 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:42:22,138 epoch 7 - iter 150/304 - loss 0.02823354 - time (sec): 7.24 - samples/sec: 2083.35 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:42:23,580 epoch 7 - iter 180/304 - loss 0.02821860 - time (sec): 8.68 - samples/sec: 2057.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:42:25,034 epoch 7 - iter 210/304 - loss 0.02646873 - time (sec): 10.13 - samples/sec: 2064.79 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:42:26,453 epoch 7 - iter 240/304 - loss 0.02543135 - time (sec): 11.55 - samples/sec: 2084.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:42:27,915 epoch 7 - iter 270/304 - loss 0.02847803 - time (sec): 13.01 - samples/sec: 2099.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:42:29,300 epoch 7 - iter 300/304 - loss 0.02760448 - time (sec): 14.40 - samples/sec: 2129.01 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:42:29,493 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:29,494 EPOCH 7 done: loss 0.0273 - lr: 0.000010
2023-10-17 09:42:30,447 DEV : loss 0.20442555844783783 - f1-score (micro avg) 0.844
2023-10-17 09:42:30,454 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:31,792 epoch 8 - iter 30/304 - loss 0.03331394 - time (sec): 1.34 - samples/sec: 2206.88 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:42:33,128 epoch 8 - iter 60/304 - loss 0.01684349 - time (sec): 2.67 - samples/sec: 2228.46 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:42:34,480 epoch 8 - iter 90/304 - loss 0.01445152 - time (sec): 4.02 - samples/sec: 2291.00 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:42:35,829 epoch 8 - iter 120/304 - loss 0.01339351 - time (sec): 5.37 - samples/sec: 2256.72 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:42:37,179 epoch 8 - iter 150/304 - loss 0.02202332 - time (sec): 6.72 - samples/sec: 2246.82 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:42:38,543 epoch 8 - iter 180/304 - loss 0.01918384 - time (sec): 8.09 - samples/sec: 2269.07 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:42:39,869 epoch 8 - iter 210/304 - loss 0.01845123 - time (sec): 9.41 - samples/sec: 2257.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:42:41,238 epoch 8 - iter 240/304 - loss 0.02065770 - time (sec): 10.78 - samples/sec: 2282.79 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:42:42,601 epoch 8 - iter 270/304 - loss 0.02081829 - time (sec): 12.15 - samples/sec: 2274.37 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:42:43,940 epoch 8 - iter 300/304 - loss 0.02159129 - time (sec): 13.49 - samples/sec: 2272.19 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:42:44,118 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:44,118 EPOCH 8 done: loss 0.0218 - lr: 0.000007
2023-10-17 09:42:45,142 DEV : loss 0.21142761409282684 - f1-score (micro avg) 0.8647
2023-10-17 09:42:45,150 saving best model
2023-10-17 09:42:45,635 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:47,017 epoch 9 - iter 30/304 - loss 0.00896017 - time (sec): 1.38 - samples/sec: 2060.55 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:42:48,418 epoch 9 - iter 60/304 - loss 0.01136709 - time (sec): 2.78 - samples/sec: 2136.14 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:42:49,776 epoch 9 - iter 90/304 - loss 0.01848839 - time (sec): 4.14 - samples/sec: 2168.92 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:42:51,111 epoch 9 - iter 120/304 - loss 0.01553642 - time (sec): 5.47 - samples/sec: 2237.45 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:42:52,448 epoch 9 - iter 150/304 - loss 0.01360062 - time (sec): 6.81 - samples/sec: 2241.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:42:53,795 epoch 9 - iter 180/304 - loss 0.01608852 - time (sec): 8.16 - samples/sec: 2233.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:42:55,125 epoch 9 - iter 210/304 - loss 0.01555607 - time (sec): 9.49 - samples/sec: 2219.00 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:42:56,493 epoch 9 - iter 240/304 - loss 0.01405601 - time (sec): 10.86 - samples/sec: 2208.17 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:42:57,909 epoch 9 - iter 270/304 - loss 0.01755853 - time (sec): 12.27 - samples/sec: 2235.31 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:42:59,338 epoch 9 - iter 300/304 - loss 0.01701838 - time (sec): 13.70 - samples/sec: 2226.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:42:59,518 ----------------------------------------------------------------------------------------------------
2023-10-17 09:42:59,519 EPOCH 9 done: loss 0.0171 - lr: 0.000003
2023-10-17 09:43:00,556 DEV : loss 0.21606358885765076 - f1-score (micro avg) 0.8657
2023-10-17 09:43:00,564 saving best model
2023-10-17 09:43:01,075 ----------------------------------------------------------------------------------------------------
2023-10-17 09:43:02,399 epoch 10 - iter 30/304 - loss 0.01079333 - time (sec): 1.32 - samples/sec: 2282.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:43:03,827 epoch 10 - iter 60/304 - loss 0.00535025 - time (sec): 2.75 - samples/sec: 2240.61 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:43:05,182 epoch 10 - iter 90/304 - loss 0.01128766 - time (sec): 4.10 - samples/sec: 2280.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:43:06,536 epoch 10 - iter 120/304 - loss 0.01589802 - time (sec): 5.46 - samples/sec: 2236.46 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:43:07,903 epoch 10 - iter 150/304 - loss 0.01443741 - time (sec): 6.82 - samples/sec: 2250.57 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:43:09,317 epoch 10 - iter 180/304 - loss 0.01637260 - time (sec): 8.24 - samples/sec: 2213.08 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:43:10,694 epoch 10 - iter 210/304 - loss 0.01591658 - time (sec): 9.62 - samples/sec: 2216.92 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:43:12,129 epoch 10 - iter 240/304 - loss 0.01523204 - time (sec): 11.05 - samples/sec: 2219.38 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:43:13,453 epoch 10 - iter 270/304 - loss 0.01368850 - time (sec): 12.37 - samples/sec: 2208.39 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:43:14,832 epoch 10 - iter 300/304 - loss 0.01292642 - time (sec): 13.75 - samples/sec: 2223.75 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:43:15,004 ----------------------------------------------------------------------------------------------------
2023-10-17 09:43:15,004 EPOCH 10 done: loss 0.0128 - lr: 0.000000
2023-10-17 09:43:16,004 DEV : loss 0.21460825204849243 - f1-score (micro avg) 0.8647
2023-10-17 09:43:16,378 ----------------------------------------------------------------------------------------------------
2023-10-17 09:43:16,379 Loading model from best epoch ...
2023-10-17 09:43:17,813 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-17 09:43:18,725
Results:
- F-score (micro) 0.8263
- F-score (macro) 0.6655
- Accuracy 0.7091
By class:
              precision    recall  f1-score   support
       scope     0.7756    0.8013    0.7883       151
        work     0.7757    0.8737    0.8218        95
        pers     0.9082    0.9271    0.9175        96
        date     0.0000    0.0000    0.0000         3
         loc     1.0000    0.6667    0.8000         3
   micro avg     0.8060    0.8477    0.8263       348
   macro avg     0.6919    0.6538    0.6655       348
weighted avg     0.8075    0.8477    0.8264       348
2023-10-17 09:43:18,725 ----------------------------------------------------------------------------------------------------
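Note: the aggregate F-scores above can be cross-checked against the per-class rows. The sketch below is a plausibility check, not Flair's evaluation code. It uses only the rounded values printed in the table, so results agree with the log to about four decimal places. The variable names are illustrative.

```python
# Per-class (f1-score, support) values copied from the "By class" table.
by_class = {
    "scope": (0.7883, 151),
    "work": (0.8218, 95),
    "pers": (0.9175, 96),
    "date": (0.0000, 3),
    "loc": (0.8000, 3),
}

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for f1, _ in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(n for _, n in by_class.values())
weighted_f1 = sum(f1 * n for f1, n in by_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.8060, 0.8477
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro={macro_f1:.4f} weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The large gap between micro (0.8263) and macro (0.6655) F1 comes mostly from the `date` class, which scores 0.0 on only 3 test spans and pulls down the unweighted mean.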