2023-10-25 21:24:56,312 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,313 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:24:56,313 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Train: 1085 sentences
2023-10-25 21:24:56,314 (train_with_dev=False, train_with_test=False)
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Training Params:
2023-10-25 21:24:56,314 - learning_rate: "3e-05"
2023-10-25 21:24:56,314 - mini_batch_size: "4"
2023-10-25 21:24:56,314 - max_epochs: "10"
2023-10-25 21:24:56,314 - shuffle: "True"
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Plugins:
2023-10-25 21:24:56,314 - TensorboardLogger
2023-10-25 21:24:56,314 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:24:56,314 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Computation:
2023-10-25 21:24:56,314 - compute on device: cuda:0
2023-10-25 21:24:56,314 - embedding storage: none
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:24:57,790 epoch 1 - iter 27/272 - loss 2.95478253 - time (sec): 1.47 - samples/sec: 3722.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:24:59,293 epoch 1 - iter 54/272 - loss 2.36462168 - time (sec): 2.98 - samples/sec: 3644.52 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:25:00,680 epoch 1 - iter 81/272 - loss 1.87243562 - time (sec): 4.36 - samples/sec: 3545.80 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:25:02,180 epoch 1 - iter 108/272 - loss 1.49436340 - time (sec): 5.86 - samples/sec: 3560.63 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:25:03,621 epoch 1 - iter 135/272 - loss 1.28098397 - time (sec): 7.31 - samples/sec: 3486.44 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:25:05,100 epoch 1 - iter 162/272 - loss 1.10859938 - time (sec): 8.78 - samples/sec: 3539.33 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:25:06,640 epoch 1 - iter 189/272 - loss 0.99184857 - time (sec): 10.33 - samples/sec: 3494.32 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:25:08,116 epoch 1 - iter 216/272 - loss 0.89139646 - time (sec): 11.80 - samples/sec: 3510.77 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:25:09,621 epoch 1 - iter 243/272 - loss 0.82167559 - time (sec): 13.31 - samples/sec: 3471.31 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:25:11,185 epoch 1 - iter 270/272 - loss 0.76133078 - time (sec): 14.87 - samples/sec: 3480.77 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:25:11,284 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:11,285 EPOCH 1 done: loss 0.7584 - lr: 0.000030
2023-10-25 21:25:12,457 DEV : loss 0.14451949298381805 - f1-score (micro avg) 0.6804
2023-10-25 21:25:12,463 saving best model
2023-10-25 21:25:12,965 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:14,476 epoch 2 - iter 27/272 - loss 0.10550064 - time (sec): 1.51 - samples/sec: 3122.04 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:25:16,011 epoch 2 - iter 54/272 - loss 0.11014367 - time (sec): 3.04 - samples/sec: 3476.83 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:25:17,480 epoch 2 - iter 81/272 - loss 0.13176071 - time (sec): 4.51 - samples/sec: 3324.86 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:25:18,967 epoch 2 - iter 108/272 - loss 0.13399368 - time (sec): 6.00 - samples/sec: 3535.90 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:25:20,463 epoch 2 - iter 135/272 - loss 0.12917079 - time (sec): 7.50 - samples/sec: 3535.18 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:25:21,972 epoch 2 - iter 162/272 - loss 0.13057126 - time (sec): 9.01 - samples/sec: 3483.46 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:25:23,459 epoch 2 - iter 189/272 - loss 0.13256525 - time (sec): 10.49 - samples/sec: 3536.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:25:24,953 epoch 2 - iter 216/272 - loss 0.12890332 - time (sec): 11.99 - samples/sec: 3606.43 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:25:26,348 epoch 2 - iter 243/272 - loss 0.12876898 - time (sec): 13.38 - samples/sec: 3557.32 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:25:27,704 epoch 2 - iter 270/272 - loss 0.12992069 - time (sec): 14.74 - samples/sec: 3518.57 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:25:27,801 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:27,801 EPOCH 2 done: loss 0.1300 - lr: 0.000027
2023-10-25 21:25:29,026 DEV : loss 0.10889776796102524 - f1-score (micro avg) 0.7868
2023-10-25 21:25:29,033 saving best model
2023-10-25 21:25:29,757 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:31,157 epoch 3 - iter 27/272 - loss 0.09712327 - time (sec): 1.40 - samples/sec: 3564.88 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:25:32,529 epoch 3 - iter 54/272 - loss 0.08242375 - time (sec): 2.77 - samples/sec: 3892.24 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:25:33,921 epoch 3 - iter 81/272 - loss 0.07445624 - time (sec): 4.16 - samples/sec: 3744.19 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:25:35,305 epoch 3 - iter 108/272 - loss 0.07196821 - time (sec): 5.55 - samples/sec: 3795.36 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:25:36,802 epoch 3 - iter 135/272 - loss 0.07055271 - time (sec): 7.04 - samples/sec: 3797.39 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:25:38,252 epoch 3 - iter 162/272 - loss 0.06865725 - time (sec): 8.49 - samples/sec: 3704.75 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:25:39,756 epoch 3 - iter 189/272 - loss 0.07384271 - time (sec): 10.00 - samples/sec: 3626.70 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:25:41,233 epoch 3 - iter 216/272 - loss 0.07124307 - time (sec): 11.47 - samples/sec: 3613.06 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:25:42,692 epoch 3 - iter 243/272 - loss 0.07419483 - time (sec): 12.93 - samples/sec: 3580.42 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:25:44,265 epoch 3 - iter 270/272 - loss 0.07300144 - time (sec): 14.51 - samples/sec: 3570.19 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:25:44,377 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:44,377 EPOCH 3 done: loss 0.0727 - lr: 0.000023
2023-10-25 21:25:45,552 DEV : loss 0.12385641783475876 - f1-score (micro avg) 0.789
2023-10-25 21:25:45,558 saving best model
2023-10-25 21:25:46,276 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:47,852 epoch 4 - iter 27/272 - loss 0.03286519 - time (sec): 1.57 - samples/sec: 3279.89 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:25:49,219 epoch 4 - iter 54/272 - loss 0.04511852 - time (sec): 2.94 - samples/sec: 3410.45 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:25:50,715 epoch 4 - iter 81/272 - loss 0.04150325 - time (sec): 4.44 - samples/sec: 3644.86 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:25:52,114 epoch 4 - iter 108/272 - loss 0.04650275 - time (sec): 5.84 - samples/sec: 3609.30 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:25:53,527 epoch 4 - iter 135/272 - loss 0.04301268 - time (sec): 7.25 - samples/sec: 3632.60 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:25:54,928 epoch 4 - iter 162/272 - loss 0.04424032 - time (sec): 8.65 - samples/sec: 3684.89 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:25:56,365 epoch 4 - iter 189/272 - loss 0.04374274 - time (sec): 10.09 - samples/sec: 3666.06 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:25:57,803 epoch 4 - iter 216/272 - loss 0.04162559 - time (sec): 11.52 - samples/sec: 3659.98 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:25:59,204 epoch 4 - iter 243/272 - loss 0.04400279 - time (sec): 12.93 - samples/sec: 3652.24 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:26:00,599 epoch 4 - iter 270/272 - loss 0.04402940 - time (sec): 14.32 - samples/sec: 3618.64 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:26:00,697 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:00,698 EPOCH 4 done: loss 0.0439 - lr: 0.000020
2023-10-25 21:26:01,832 DEV : loss 0.13214778900146484 - f1-score (micro avg) 0.8022
2023-10-25 21:26:01,838 saving best model
2023-10-25 21:26:02,546 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:04,042 epoch 5 - iter 27/272 - loss 0.01630458 - time (sec): 1.49 - samples/sec: 3393.60 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:26:05,505 epoch 5 - iter 54/272 - loss 0.03437239 - time (sec): 2.96 - samples/sec: 3323.21 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:26:07,020 epoch 5 - iter 81/272 - loss 0.03191448 - time (sec): 4.47 - samples/sec: 3306.77 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:26:08,585 epoch 5 - iter 108/272 - loss 0.02979947 - time (sec): 6.04 - samples/sec: 3311.14 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:26:10,374 epoch 5 - iter 135/272 - loss 0.02958984 - time (sec): 7.82 - samples/sec: 3099.23 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:26:11,974 epoch 5 - iter 162/272 - loss 0.03062299 - time (sec): 9.42 - samples/sec: 3172.34 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:26:13,507 epoch 5 - iter 189/272 - loss 0.03100487 - time (sec): 10.96 - samples/sec: 3208.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:26:14,971 epoch 5 - iter 216/272 - loss 0.02954207 - time (sec): 12.42 - samples/sec: 3235.39 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:26:16,462 epoch 5 - iter 243/272 - loss 0.03056086 - time (sec): 13.91 - samples/sec: 3321.39 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:26:17,920 epoch 5 - iter 270/272 - loss 0.03180794 - time (sec): 15.37 - samples/sec: 3371.03 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:26:18,023 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:18,023 EPOCH 5 done: loss 0.0320 - lr: 0.000017
2023-10-25 21:26:19,214 DEV : loss 0.1537138819694519 - f1-score (micro avg) 0.7964
2023-10-25 21:26:19,220 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:20,792 epoch 6 - iter 27/272 - loss 0.02666730 - time (sec): 1.57 - samples/sec: 3471.94 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:26:22,333 epoch 6 - iter 54/272 - loss 0.01962865 - time (sec): 3.11 - samples/sec: 3446.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:26:23,857 epoch 6 - iter 81/272 - loss 0.01821303 - time (sec): 4.64 - samples/sec: 3337.98 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:26:25,433 epoch 6 - iter 108/272 - loss 0.02318386 - time (sec): 6.21 - samples/sec: 3341.21 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:26:26,946 epoch 6 - iter 135/272 - loss 0.02155185 - time (sec): 7.72 - samples/sec: 3320.13 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:26:28,462 epoch 6 - iter 162/272 - loss 0.02185104 - time (sec): 9.24 - samples/sec: 3366.02 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:26:29,970 epoch 6 - iter 189/272 - loss 0.02306003 - time (sec): 10.75 - samples/sec: 3410.78 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:26:31,474 epoch 6 - iter 216/272 - loss 0.02366588 - time (sec): 12.25 - samples/sec: 3337.30 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:26:32,976 epoch 6 - iter 243/272 - loss 0.02325498 - time (sec): 13.75 - samples/sec: 3382.56 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:26:34,454 epoch 6 - iter 270/272 - loss 0.02340385 - time (sec): 15.23 - samples/sec: 3385.88 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:26:34,566 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:34,566 EPOCH 6 done: loss 0.0232 - lr: 0.000013
2023-10-25 21:26:35,844 DEV : loss 0.14856071770191193 - f1-score (micro avg) 0.8118
2023-10-25 21:26:35,850 saving best model
2023-10-25 21:26:36,568 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:38,086 epoch 7 - iter 27/272 - loss 0.01152928 - time (sec): 1.51 - samples/sec: 3665.41 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:26:39,572 epoch 7 - iter 54/272 - loss 0.01071992 - time (sec): 3.00 - samples/sec: 3522.80 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:26:41,063 epoch 7 - iter 81/272 - loss 0.01059353 - time (sec): 4.49 - samples/sec: 3519.69 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:26:42,575 epoch 7 - iter 108/272 - loss 0.01206115 - time (sec): 6.00 - samples/sec: 3579.45 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:26:44,092 epoch 7 - iter 135/272 - loss 0.01324880 - time (sec): 7.52 - samples/sec: 3495.96 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:26:45,553 epoch 7 - iter 162/272 - loss 0.01362028 - time (sec): 8.98 - samples/sec: 3493.65 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:26:47,112 epoch 7 - iter 189/272 - loss 0.01636107 - time (sec): 10.54 - samples/sec: 3516.51 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:26:48,600 epoch 7 - iter 216/272 - loss 0.01678638 - time (sec): 12.03 - samples/sec: 3514.09 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:26:50,082 epoch 7 - iter 243/272 - loss 0.01800984 - time (sec): 13.51 - samples/sec: 3469.39 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:26:51,588 epoch 7 - iter 270/272 - loss 0.01706655 - time (sec): 15.02 - samples/sec: 3438.63 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:26:51,705 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:51,705 EPOCH 7 done: loss 0.0170 - lr: 0.000010
2023-10-25 21:26:52,958 DEV : loss 0.15049101412296295 - f1-score (micro avg) 0.817
2023-10-25 21:26:52,964 saving best model
2023-10-25 21:26:53,672 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:55,233 epoch 8 - iter 27/272 - loss 0.01955441 - time (sec): 1.56 - samples/sec: 3900.71 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:26:56,670 epoch 8 - iter 54/272 - loss 0.02251273 - time (sec): 3.00 - samples/sec: 3684.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:26:58,230 epoch 8 - iter 81/272 - loss 0.01870268 - time (sec): 4.56 - samples/sec: 3653.96 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:26:59,730 epoch 8 - iter 108/272 - loss 0.01802886 - time (sec): 6.06 - samples/sec: 3597.69 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:27:01,270 epoch 8 - iter 135/272 - loss 0.01675014 - time (sec): 7.60 - samples/sec: 3588.73 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:27:02,817 epoch 8 - iter 162/272 - loss 0.01636457 - time (sec): 9.14 - samples/sec: 3512.11 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:27:04,347 epoch 8 - iter 189/272 - loss 0.01653653 - time (sec): 10.67 - samples/sec: 3493.60 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:27:05,891 epoch 8 - iter 216/272 - loss 0.01494514 - time (sec): 12.22 - samples/sec: 3482.53 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:27:07,413 epoch 8 - iter 243/272 - loss 0.01358305 - time (sec): 13.74 - samples/sec: 3451.57 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:27:08,867 epoch 8 - iter 270/272 - loss 0.01344559 - time (sec): 15.19 - samples/sec: 3413.90 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:27:08,973 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:08,973 EPOCH 8 done: loss 0.0134 - lr: 0.000007
2023-10-25 21:27:10,570 DEV : loss 0.16718466579914093 - f1-score (micro avg) 0.8244
2023-10-25 21:27:10,576 saving best model
2023-10-25 21:27:11,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:12,843 epoch 9 - iter 27/272 - loss 0.00164870 - time (sec): 1.57 - samples/sec: 3747.02 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:27:14,311 epoch 9 - iter 54/272 - loss 0.00851133 - time (sec): 3.04 - samples/sec: 3601.08 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:27:15,778 epoch 9 - iter 81/272 - loss 0.00761523 - time (sec): 4.51 - samples/sec: 3449.84 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:27:17,240 epoch 9 - iter 108/272 - loss 0.00949545 - time (sec): 5.97 - samples/sec: 3405.25 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:27:18,722 epoch 9 - iter 135/272 - loss 0.00962367 - time (sec): 7.45 - samples/sec: 3446.74 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:27:20,186 epoch 9 - iter 162/272 - loss 0.00970501 - time (sec): 8.91 - samples/sec: 3481.51 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:27:21,676 epoch 9 - iter 189/272 - loss 0.00915997 - time (sec): 10.40 - samples/sec: 3534.96 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:27:23,133 epoch 9 - iter 216/272 - loss 0.00823915 - time (sec): 11.86 - samples/sec: 3493.39 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:27:24,602 epoch 9 - iter 243/272 - loss 0.00817118 - time (sec): 13.33 - samples/sec: 3507.63 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:27:26,104 epoch 9 - iter 270/272 - loss 0.01010203 - time (sec): 14.83 - samples/sec: 3495.59 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:27:26,198 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:26,199 EPOCH 9 done: loss 0.0101 - lr: 0.000003
2023-10-25 21:27:27,425 DEV : loss 0.16382966935634613 - f1-score (micro avg) 0.8183
2023-10-25 21:27:27,431 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:28,874 epoch 10 - iter 27/272 - loss 0.00996204 - time (sec): 1.44 - samples/sec: 3784.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:27:30,250 epoch 10 - iter 54/272 - loss 0.00822023 - time (sec): 2.82 - samples/sec: 3628.66 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:27:31,708 epoch 10 - iter 81/272 - loss 0.00718997 - time (sec): 4.28 - samples/sec: 3668.97 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:27:33,179 epoch 10 - iter 108/272 - loss 0.00636107 - time (sec): 5.75 - samples/sec: 3707.51 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:27:34,708 epoch 10 - iter 135/272 - loss 0.00714120 - time (sec): 7.28 - samples/sec: 3587.09 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:27:36,272 epoch 10 - iter 162/272 - loss 0.00626853 - time (sec): 8.84 - samples/sec: 3523.31 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:27:37,846 epoch 10 - iter 189/272 - loss 0.00615876 - time (sec): 10.41 - samples/sec: 3525.63 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:27:39,413 epoch 10 - iter 216/272 - loss 0.00562173 - time (sec): 11.98 - samples/sec: 3477.07 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:27:40,880 epoch 10 - iter 243/272 - loss 0.00638001 - time (sec): 13.45 - samples/sec: 3458.88 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:27:42,267 epoch 10 - iter 270/272 - loss 0.00607942 - time (sec): 14.83 - samples/sec: 3488.34 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:27:42,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:42,364 EPOCH 10 done: loss 0.0061 - lr: 0.000000
2023-10-25 21:27:43,597 DEV : loss 0.16699165105819702 - f1-score (micro avg) 0.8281
2023-10-25 21:27:43,604 saving best model
2023-10-25 21:27:44,815 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:44,817 Loading model from best epoch ...
2023-10-25 21:27:46,707 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-25 21:27:48,940
Results:
- F-score (micro) 0.7769
- F-score (macro) 0.7273
- Accuracy 0.6538

By class:
              precision    recall  f1-score   support

         LOC     0.8173    0.8462    0.8315       312
         PER     0.6923    0.8654    0.7692       208
         ORG     0.4643    0.4727    0.4685        55
   HumanProd     0.7500    0.9545    0.8400        22

   micro avg     0.7361    0.8224    0.7769       597
   macro avg     0.6810    0.7847    0.7273       597
weighted avg     0.7388    0.8224    0.7767       597

2023-10-25 21:27:48,940 ----------------------------------------------------------------------------------------------------