|
2023-10-18 15:59:38,444 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,444 Model: "SequenceTagger( |
|
  (embeddings): TransformerWordEmbeddings( |
|
    (model): BertModel( |
|
      (embeddings): BertEmbeddings( |
|
        (word_embeddings): Embedding(32001, 128) |
|
        (position_embeddings): Embedding(512, 128) |
|
        (token_type_embeddings): Embedding(2, 128) |
|
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
        (dropout): Dropout(p=0.1, inplace=False) |
|
      ) |
|
      (encoder): BertEncoder( |
|
        (layer): ModuleList( |
|
          (0-1): 2 x BertLayer( |
|
            (attention): BertAttention( |
|
              (self): BertSelfAttention( |
|
                (query): Linear(in_features=128, out_features=128, bias=True) |
|
                (key): Linear(in_features=128, out_features=128, bias=True) |
|
                (value): Linear(in_features=128, out_features=128, bias=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
              (output): BertSelfOutput( |
|
                (dense): Linear(in_features=128, out_features=128, bias=True) |
|
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
            ) |
|
            (intermediate): BertIntermediate( |
|
              (dense): Linear(in_features=128, out_features=512, bias=True) |
|
              (intermediate_act_fn): GELUActivation() |
|
            ) |
|
            (output): BertOutput( |
|
              (dense): Linear(in_features=512, out_features=128, bias=True) |
|
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
              (dropout): Dropout(p=0.1, inplace=False) |
|
            ) |
|
          ) |
|
        ) |
|
      ) |
|
      (pooler): BertPooler( |
|
        (dense): Linear(in_features=128, out_features=128, bias=True) |
|
        (activation): Tanh() |
|
      ) |
|
    ) |
|
  ) |
|
  (locked_dropout): LockedDropout(p=0.5) |
|
  (linear): Linear(in_features=128, out_features=25, bias=True) |
|
  (loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-18 15:59:38,444 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,444 MultiCorpus: 1214 train + 266 dev + 251 test sentences |
|
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator |
|
2023-10-18 15:59:38,444 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,444 Train: 1214 sentences |
|
2023-10-18 15:59:38,444 (train_with_dev=False, train_with_test=False) |
|
2023-10-18 15:59:38,444 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,445 Training Params: |
|
2023-10-18 15:59:38,445 - learning_rate: "5e-05" |
|
2023-10-18 15:59:38,445 - mini_batch_size: "4" |
|
2023-10-18 15:59:38,445 - max_epochs: "10" |
|
2023-10-18 15:59:38,445 - shuffle: "True" |
|
2023-10-18 15:59:38,445 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,445 Plugins: |
|
2023-10-18 15:59:38,445 - TensorboardLogger |
|
2023-10-18 15:59:38,445 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-18 15:59:38,445 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,445 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-18 15:59:38,445 - metric: "('micro avg', 'f1-score')" |
|
2023-10-18 15:59:38,445 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,445 Computation: |
|
2023-10-18 15:59:38,445 - compute on device: cuda:0 |
|
2023-10-18 15:59:38,445 - embedding storage: none |
|
2023-10-18 15:59:38,445 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,445 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-18 15:59:38,445 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,445 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:38,445 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-18 15:59:38,893 epoch 1 - iter 30/304 - loss 4.02439859 - time (sec): 0.45 - samples/sec: 6908.01 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 15:59:39,332 epoch 1 - iter 60/304 - loss 3.94799824 - time (sec): 0.89 - samples/sec: 6775.15 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 15:59:39,798 epoch 1 - iter 90/304 - loss 3.77027124 - time (sec): 1.35 - samples/sec: 6645.20 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 15:59:40,240 epoch 1 - iter 120/304 - loss 3.52488959 - time (sec): 1.79 - samples/sec: 6631.52 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 15:59:40,680 epoch 1 - iter 150/304 - loss 3.22781102 - time (sec): 2.23 - samples/sec: 6620.98 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 15:59:41,127 epoch 1 - iter 180/304 - loss 2.92761720 - time (sec): 2.68 - samples/sec: 6552.60 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 15:59:41,584 epoch 1 - iter 210/304 - loss 2.61647495 - time (sec): 3.14 - samples/sec: 6656.08 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-18 15:59:42,044 epoch 1 - iter 240/304 - loss 2.38351652 - time (sec): 3.60 - samples/sec: 6683.60 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-18 15:59:42,493 epoch 1 - iter 270/304 - loss 2.18072309 - time (sec): 4.05 - samples/sec: 6762.41 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-18 15:59:42,947 epoch 1 - iter 300/304 - loss 2.03065121 - time (sec): 4.50 - samples/sec: 6818.46 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 15:59:43,004 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:43,004 EPOCH 1 done: loss 2.0164 - lr: 0.000049 |
|
2023-10-18 15:59:43,462 DEV : loss 0.7537270188331604 - f1-score (micro avg) 0.0 |
|
2023-10-18 15:59:43,467 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:43,917 epoch 2 - iter 30/304 - loss 0.67693457 - time (sec): 0.45 - samples/sec: 7263.16 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 15:59:44,381 epoch 2 - iter 60/304 - loss 0.68249441 - time (sec): 0.91 - samples/sec: 6828.38 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 15:59:44,855 epoch 2 - iter 90/304 - loss 0.68860219 - time (sec): 1.39 - samples/sec: 6743.02 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-18 15:59:45,375 epoch 2 - iter 120/304 - loss 0.68478765 - time (sec): 1.91 - samples/sec: 6636.08 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-18 15:59:45,842 epoch 2 - iter 150/304 - loss 0.64870052 - time (sec): 2.37 - samples/sec: 6488.58 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-18 15:59:46,320 epoch 2 - iter 180/304 - loss 0.64138183 - time (sec): 2.85 - samples/sec: 6412.79 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-18 15:59:46,774 epoch 2 - iter 210/304 - loss 0.60885352 - time (sec): 3.31 - samples/sec: 6538.11 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-18 15:59:47,232 epoch 2 - iter 240/304 - loss 0.59175345 - time (sec): 3.76 - samples/sec: 6515.91 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-18 15:59:47,681 epoch 2 - iter 270/304 - loss 0.58324719 - time (sec): 4.21 - samples/sec: 6578.92 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-18 15:59:48,135 epoch 2 - iter 300/304 - loss 0.58565562 - time (sec): 4.67 - samples/sec: 6569.76 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-18 15:59:48,196 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:48,196 EPOCH 2 done: loss 0.5838 - lr: 0.000045 |
|
2023-10-18 15:59:48,694 DEV : loss 0.39393150806427 - f1-score (micro avg) 0.2408 |
|
2023-10-18 15:59:48,699 saving best model |
|
2023-10-18 15:59:48,728 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:49,181 epoch 3 - iter 30/304 - loss 0.42728775 - time (sec): 0.45 - samples/sec: 6484.68 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-18 15:59:49,667 epoch 3 - iter 60/304 - loss 0.43840468 - time (sec): 0.94 - samples/sec: 6754.01 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-18 15:59:50,138 epoch 3 - iter 90/304 - loss 0.43340952 - time (sec): 1.41 - samples/sec: 6653.01 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-18 15:59:50,622 epoch 3 - iter 120/304 - loss 0.42193644 - time (sec): 1.89 - samples/sec: 6790.83 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-18 15:59:51,123 epoch 3 - iter 150/304 - loss 0.41461292 - time (sec): 2.39 - samples/sec: 6665.34 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-18 15:59:51,586 epoch 3 - iter 180/304 - loss 0.42630692 - time (sec): 2.86 - samples/sec: 6629.85 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-18 15:59:52,043 epoch 3 - iter 210/304 - loss 0.42817892 - time (sec): 3.31 - samples/sec: 6581.34 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-18 15:59:52,481 epoch 3 - iter 240/304 - loss 0.41490933 - time (sec): 3.75 - samples/sec: 6564.30 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-18 15:59:52,929 epoch 3 - iter 270/304 - loss 0.41477427 - time (sec): 4.20 - samples/sec: 6579.62 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-18 15:59:53,383 epoch 3 - iter 300/304 - loss 0.40965081 - time (sec): 4.65 - samples/sec: 6576.98 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-18 15:59:53,440 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:53,440 EPOCH 3 done: loss 0.4082 - lr: 0.000039 |
|
2023-10-18 15:59:53,951 DEV : loss 0.32593443989753723 - f1-score (micro avg) 0.4345 |
|
2023-10-18 15:59:53,957 saving best model |
|
2023-10-18 15:59:53,996 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:54,450 epoch 4 - iter 30/304 - loss 0.34255467 - time (sec): 0.45 - samples/sec: 6790.88 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-18 15:59:54,894 epoch 4 - iter 60/304 - loss 0.39274816 - time (sec): 0.90 - samples/sec: 6877.07 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-18 15:59:55,348 epoch 4 - iter 90/304 - loss 0.37778060 - time (sec): 1.35 - samples/sec: 6903.74 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-18 15:59:55,815 epoch 4 - iter 120/304 - loss 0.37593215 - time (sec): 1.82 - samples/sec: 6776.34 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-18 15:59:56,274 epoch 4 - iter 150/304 - loss 0.36040013 - time (sec): 2.28 - samples/sec: 6707.14 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-18 15:59:56,717 epoch 4 - iter 180/304 - loss 0.35667086 - time (sec): 2.72 - samples/sec: 6684.37 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-18 15:59:57,173 epoch 4 - iter 210/304 - loss 0.35194304 - time (sec): 3.18 - samples/sec: 6724.60 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-18 15:59:57,624 epoch 4 - iter 240/304 - loss 0.34062892 - time (sec): 3.63 - samples/sec: 6771.82 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-18 15:59:58,077 epoch 4 - iter 270/304 - loss 0.33268323 - time (sec): 4.08 - samples/sec: 6742.26 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-18 15:59:58,528 epoch 4 - iter 300/304 - loss 0.33151212 - time (sec): 4.53 - samples/sec: 6759.21 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-18 15:59:58,582 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:58,582 EPOCH 4 done: loss 0.3309 - lr: 0.000033 |
|
2023-10-18 15:59:59,095 DEV : loss 0.29030725359916687 - f1-score (micro avg) 0.4636 |
|
2023-10-18 15:59:59,101 saving best model |
|
2023-10-18 15:59:59,136 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 15:59:59,589 epoch 5 - iter 30/304 - loss 0.30069451 - time (sec): 0.45 - samples/sec: 6982.39 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-18 16:00:00,045 epoch 5 - iter 60/304 - loss 0.32261811 - time (sec): 0.91 - samples/sec: 7065.54 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-18 16:00:00,500 epoch 5 - iter 90/304 - loss 0.29691347 - time (sec): 1.36 - samples/sec: 7070.53 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-18 16:00:00,962 epoch 5 - iter 120/304 - loss 0.28228503 - time (sec): 1.83 - samples/sec: 6873.42 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-18 16:00:01,424 epoch 5 - iter 150/304 - loss 0.27506302 - time (sec): 2.29 - samples/sec: 6831.76 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-18 16:00:01,878 epoch 5 - iter 180/304 - loss 0.27966818 - time (sec): 2.74 - samples/sec: 6748.00 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 16:00:02,329 epoch 5 - iter 210/304 - loss 0.28859803 - time (sec): 3.19 - samples/sec: 6813.04 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 16:00:02,786 epoch 5 - iter 240/304 - loss 0.29244738 - time (sec): 3.65 - samples/sec: 6743.30 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 16:00:03,233 epoch 5 - iter 270/304 - loss 0.28822906 - time (sec): 4.10 - samples/sec: 6762.72 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 16:00:03,690 epoch 5 - iter 300/304 - loss 0.29090928 - time (sec): 4.55 - samples/sec: 6751.37 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 16:00:03,744 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:03,744 EPOCH 5 done: loss 0.2904 - lr: 0.000028 |
|
2023-10-18 16:00:04,278 DEV : loss 0.26164156198501587 - f1-score (micro avg) 0.5045 |
|
2023-10-18 16:00:04,283 saving best model |
|
2023-10-18 16:00:04,319 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:04,772 epoch 6 - iter 30/304 - loss 0.28626393 - time (sec): 0.45 - samples/sec: 6235.37 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 16:00:05,211 epoch 6 - iter 60/304 - loss 0.27365340 - time (sec): 0.89 - samples/sec: 6403.24 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 16:00:05,666 epoch 6 - iter 90/304 - loss 0.26753734 - time (sec): 1.35 - samples/sec: 6724.20 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 16:00:06,122 epoch 6 - iter 120/304 - loss 0.28549914 - time (sec): 1.80 - samples/sec: 6700.80 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 16:00:06,578 epoch 6 - iter 150/304 - loss 0.27259411 - time (sec): 2.26 - samples/sec: 6743.60 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 16:00:07,035 epoch 6 - iter 180/304 - loss 0.27162046 - time (sec): 2.72 - samples/sec: 6713.65 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 16:00:07,499 epoch 6 - iter 210/304 - loss 0.26916890 - time (sec): 3.18 - samples/sec: 6705.00 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 16:00:07,951 epoch 6 - iter 240/304 - loss 0.26300420 - time (sec): 3.63 - samples/sec: 6679.51 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 16:00:08,408 epoch 6 - iter 270/304 - loss 0.26565702 - time (sec): 4.09 - samples/sec: 6643.81 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 16:00:08,886 epoch 6 - iter 300/304 - loss 0.26199835 - time (sec): 4.57 - samples/sec: 6699.47 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 16:00:08,945 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:08,945 EPOCH 6 done: loss 0.2617 - lr: 0.000022 |
|
2023-10-18 16:00:09,454 DEV : loss 0.25211378931999207 - f1-score (micro avg) 0.5287 |
|
2023-10-18 16:00:09,459 saving best model |
|
2023-10-18 16:00:09,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:09,951 epoch 7 - iter 30/304 - loss 0.24943728 - time (sec): 0.46 - samples/sec: 6462.10 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 16:00:10,408 epoch 7 - iter 60/304 - loss 0.25514322 - time (sec): 0.91 - samples/sec: 6606.04 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 16:00:10,865 epoch 7 - iter 90/304 - loss 0.25104880 - time (sec): 1.37 - samples/sec: 6578.90 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 16:00:11,310 epoch 7 - iter 120/304 - loss 0.25691959 - time (sec): 1.81 - samples/sec: 6662.24 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 16:00:11,761 epoch 7 - iter 150/304 - loss 0.25712263 - time (sec): 2.27 - samples/sec: 6781.94 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 16:00:12,209 epoch 7 - iter 180/304 - loss 0.25996239 - time (sec): 2.71 - samples/sec: 6747.99 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 16:00:12,659 epoch 7 - iter 210/304 - loss 0.25169702 - time (sec): 3.16 - samples/sec: 6721.71 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 16:00:13,115 epoch 7 - iter 240/304 - loss 0.25021440 - time (sec): 3.62 - samples/sec: 6715.13 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 16:00:13,581 epoch 7 - iter 270/304 - loss 0.24382932 - time (sec): 4.09 - samples/sec: 6722.19 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 16:00:14,043 epoch 7 - iter 300/304 - loss 0.24830003 - time (sec): 4.55 - samples/sec: 6742.78 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 16:00:14,107 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:14,107 EPOCH 7 done: loss 0.2474 - lr: 0.000017 |
|
2023-10-18 16:00:14,609 DEV : loss 0.2437625378370285 - f1-score (micro avg) 0.5487 |
|
2023-10-18 16:00:14,614 saving best model |
|
2023-10-18 16:00:14,649 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:15,121 epoch 8 - iter 30/304 - loss 0.22448797 - time (sec): 0.47 - samples/sec: 5622.44 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 16:00:15,595 epoch 8 - iter 60/304 - loss 0.23152462 - time (sec): 0.95 - samples/sec: 6010.92 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 16:00:16,068 epoch 8 - iter 90/304 - loss 0.21243463 - time (sec): 1.42 - samples/sec: 6088.48 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 16:00:16,566 epoch 8 - iter 120/304 - loss 0.23068737 - time (sec): 1.92 - samples/sec: 6156.13 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 16:00:17,042 epoch 8 - iter 150/304 - loss 0.23114576 - time (sec): 2.39 - samples/sec: 6336.38 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 16:00:17,499 epoch 8 - iter 180/304 - loss 0.23233799 - time (sec): 2.85 - samples/sec: 6309.95 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 16:00:17,942 epoch 8 - iter 210/304 - loss 0.23090640 - time (sec): 3.29 - samples/sec: 6464.13 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 16:00:18,388 epoch 8 - iter 240/304 - loss 0.23271789 - time (sec): 3.74 - samples/sec: 6553.42 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 16:00:18,864 epoch 8 - iter 270/304 - loss 0.23726179 - time (sec): 4.21 - samples/sec: 6553.74 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 16:00:19,319 epoch 8 - iter 300/304 - loss 0.23601616 - time (sec): 4.67 - samples/sec: 6567.33 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 16:00:19,375 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:19,376 EPOCH 8 done: loss 0.2358 - lr: 0.000011 |
|
2023-10-18 16:00:19,894 DEV : loss 0.23939570784568787 - f1-score (micro avg) 0.5522 |
|
2023-10-18 16:00:19,899 saving best model |
|
2023-10-18 16:00:19,933 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:20,398 epoch 9 - iter 30/304 - loss 0.21473866 - time (sec): 0.46 - samples/sec: 6421.88 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 16:00:20,878 epoch 9 - iter 60/304 - loss 0.24345455 - time (sec): 0.94 - samples/sec: 6481.83 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 16:00:21,342 epoch 9 - iter 90/304 - loss 0.25235638 - time (sec): 1.41 - samples/sec: 6575.74 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 16:00:21,791 epoch 9 - iter 120/304 - loss 0.23860452 - time (sec): 1.86 - samples/sec: 6701.57 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 16:00:22,252 epoch 9 - iter 150/304 - loss 0.23208699 - time (sec): 2.32 - samples/sec: 6708.59 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 16:00:22,695 epoch 9 - iter 180/304 - loss 0.23166946 - time (sec): 2.76 - samples/sec: 6639.55 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 16:00:23,155 epoch 9 - iter 210/304 - loss 0.23227634 - time (sec): 3.22 - samples/sec: 6721.93 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 16:00:23,602 epoch 9 - iter 240/304 - loss 0.23524665 - time (sec): 3.67 - samples/sec: 6681.01 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 16:00:24,048 epoch 9 - iter 270/304 - loss 0.23400306 - time (sec): 4.11 - samples/sec: 6646.96 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 16:00:24,505 epoch 9 - iter 300/304 - loss 0.23000825 - time (sec): 4.57 - samples/sec: 6696.02 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 16:00:24,561 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:24,561 EPOCH 9 done: loss 0.2303 - lr: 0.000006 |
|
2023-10-18 16:00:25,084 DEV : loss 0.2380855232477188 - f1-score (micro avg) 0.5534 |
|
2023-10-18 16:00:25,090 saving best model |
|
2023-10-18 16:00:25,123 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:25,583 epoch 10 - iter 30/304 - loss 0.19168195 - time (sec): 0.46 - samples/sec: 6711.13 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 16:00:26,055 epoch 10 - iter 60/304 - loss 0.17965363 - time (sec): 0.93 - samples/sec: 6704.56 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 16:00:26,495 epoch 10 - iter 90/304 - loss 0.20505631 - time (sec): 1.37 - samples/sec: 6766.85 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 16:00:26,934 epoch 10 - iter 120/304 - loss 0.20380845 - time (sec): 1.81 - samples/sec: 6660.94 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 16:00:27,377 epoch 10 - iter 150/304 - loss 0.20047948 - time (sec): 2.25 - samples/sec: 6623.80 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 16:00:27,828 epoch 10 - iter 180/304 - loss 0.21028851 - time (sec): 2.70 - samples/sec: 6690.30 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 16:00:28,295 epoch 10 - iter 210/304 - loss 0.21399489 - time (sec): 3.17 - samples/sec: 6675.00 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 16:00:28,759 epoch 10 - iter 240/304 - loss 0.22201720 - time (sec): 3.64 - samples/sec: 6667.29 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 16:00:29,232 epoch 10 - iter 270/304 - loss 0.22425873 - time (sec): 4.11 - samples/sec: 6634.53 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 16:00:29,693 epoch 10 - iter 300/304 - loss 0.22507873 - time (sec): 4.57 - samples/sec: 6694.32 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 16:00:29,753 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:29,753 EPOCH 10 done: loss 0.2235 - lr: 0.000000 |
|
2023-10-18 16:00:30,275 DEV : loss 0.23677615821361542 - f1-score (micro avg) 0.5579 |
|
2023-10-18 16:00:30,281 saving best model |
|
2023-10-18 16:00:30,340 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 16:00:30,340 Loading model from best epoch ... |
|
2023-10-18 16:00:30,417 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object |
|
2023-10-18 16:00:30,907 |
|
Results: |
|
- F-score (micro) 0.5992 |
|
- F-score (macro) 0.3691 |
|
- Accuracy 0.4524 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
       scope     0.5082    0.6159    0.5569       151 |
|
        work     0.5175    0.7789    0.6218        95 |
|
        pers     0.7011    0.6354    0.6667        96 |
|
         loc     0.0000    0.0000    0.0000         3 |
|
        date     0.0000    0.0000    0.0000         3 |
|
|
|
   micro avg     0.5521    0.6552    0.5992       348 |
|
   macro avg     0.3454    0.4061    0.3691       348 |
|
weighted avg     0.5552    0.6552    0.5953       348 |
|
|
|
2023-10-18 16:00:30,907 ---------------------------------------------------------------------------------------------------- |
|
|