2023-10-25 21:22:31,902 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,903 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:22:31,903 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,903 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:22:31,903 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,903 Train: 1085 sentences
2023-10-25 21:22:31,903 (train_with_dev=False, train_with_test=False)
2023-10-25 21:22:31,903 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,903 Training Params:
2023-10-25 21:22:31,903 - learning_rate: "5e-05"
2023-10-25 21:22:31,903 - mini_batch_size: "8"
2023-10-25 21:22:31,903 - max_epochs: "10"
2023-10-25 21:22:31,903 - shuffle: "True"
2023-10-25 21:22:31,903 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,904 Plugins:
2023-10-25 21:22:31,904 - TensorboardLogger
2023-10-25 21:22:31,904 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:22:31,904 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,904 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:22:31,904 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:22:31,904 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,904 Computation:
2023-10-25 21:22:31,904 - compute on device: cuda:0
2023-10-25 21:22:31,904 - embedding storage: none
2023-10-25 21:22:31,904 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,904 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 21:22:31,904 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,904 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:31,904 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:22:32,963 epoch 1 - iter 13/136 - loss 3.09896681 - time (sec): 1.06 - samples/sec: 4971.59 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:22:33,851 epoch 1 - iter 26/136 - loss 2.47193972 - time (sec): 1.95 - samples/sec: 5448.35 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:22:34,739 epoch 1 - iter 39/136 - loss 1.97440985 - time (sec): 2.83 - samples/sec: 5365.69 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:22:35,727 epoch 1 - iter 52/136 - loss 1.60287064 - time (sec): 3.82 - samples/sec: 5263.45 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:22:36,687 epoch 1 - iter 65/136 - loss 1.38982177 - time (sec): 4.78 - samples/sec: 5065.73 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:22:37,736 epoch 1 - iter 78/136 - loss 1.19122238 - time (sec): 5.83 - samples/sec: 5137.90 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:22:38,910 epoch 1 - iter 91/136 - loss 1.05383990 - time (sec): 7.01 - samples/sec: 5024.46 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:22:39,818 epoch 1 - iter 104/136 - loss 0.96075020 - time (sec): 7.91 - samples/sec: 5029.37 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:22:40,747 epoch 1 - iter 117/136 - loss 0.88175036 - time (sec): 8.84 - samples/sec: 5027.46 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:22:41,808 epoch 1 - iter 130/136 - loss 0.81359821 - time (sec): 9.90 - samples/sec: 5032.99 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:22:42,244 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:42,244 EPOCH 1 done: loss 0.7861 - lr: 0.000047
2023-10-25 21:22:43,416 DEV : loss 0.1567286252975464 - f1-score (micro avg) 0.6667
2023-10-25 21:22:43,422 saving best model
2023-10-25 21:22:43,922 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:44,828 epoch 2 - iter 13/136 - loss 0.15479314 - time (sec): 0.90 - samples/sec: 5122.73 - lr: 0.000050 - momentum: 0.000000
2023-10-25 21:22:45,884 epoch 2 - iter 26/136 - loss 0.16353171 - time (sec): 1.96 - samples/sec: 5090.38 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:22:46,917 epoch 2 - iter 39/136 - loss 0.17834152 - time (sec): 2.99 - samples/sec: 4770.56 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:22:47,974 epoch 2 - iter 52/136 - loss 0.18487150 - time (sec): 4.05 - samples/sec: 4971.38 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:22:49,007 epoch 2 - iter 65/136 - loss 0.17806542 - time (sec): 5.08 - samples/sec: 5028.22 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:22:49,975 epoch 2 - iter 78/136 - loss 0.18173881 - time (sec): 6.05 - samples/sec: 4984.22 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:22:51,088 epoch 2 - iter 91/136 - loss 0.17775518 - time (sec): 7.16 - samples/sec: 5019.83 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:22:52,125 epoch 2 - iter 104/136 - loss 0.17122320 - time (sec): 8.20 - samples/sec: 5069.36 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:22:53,011 epoch 2 - iter 117/136 - loss 0.16735079 - time (sec): 9.09 - samples/sec: 5044.82 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:22:53,975 epoch 2 - iter 130/136 - loss 0.17094941 - time (sec): 10.05 - samples/sec: 5012.09 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:22:54,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:54,365 EPOCH 2 done: loss 0.1728 - lr: 0.000045
2023-10-25 21:22:55,592 DEV : loss 0.1621260643005371 - f1-score (micro avg) 0.6223
2023-10-25 21:22:55,599 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:56,591 epoch 3 - iter 13/136 - loss 0.20659280 - time (sec): 0.99 - samples/sec: 4997.92 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:22:57,499 epoch 3 - iter 26/136 - loss 0.16584398 - time (sec): 1.90 - samples/sec: 5368.53 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:22:58,542 epoch 3 - iter 39/136 - loss 0.15400013 - time (sec): 2.94 - samples/sec: 5170.35 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:22:59,533 epoch 3 - iter 52/136 - loss 0.13409050 - time (sec): 3.93 - samples/sec: 5240.89 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:23:00,543 epoch 3 - iter 65/136 - loss 0.12610680 - time (sec): 4.94 - samples/sec: 5222.21 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:23:01,529 epoch 3 - iter 78/136 - loss 0.11908215 - time (sec): 5.93 - samples/sec: 5167.99 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:23:02,462 epoch 3 - iter 91/136 - loss 0.11495720 - time (sec): 6.86 - samples/sec: 5073.98 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:23:03,550 epoch 3 - iter 104/136 - loss 0.11135219 - time (sec): 7.95 - samples/sec: 5030.75 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:23:04,600 epoch 3 - iter 117/136 - loss 0.11206355 - time (sec): 9.00 - samples/sec: 5011.62 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:23:05,486 epoch 3 - iter 130/136 - loss 0.10930128 - time (sec): 9.89 - samples/sec: 4995.03 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:23:05,953 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:05,954 EPOCH 3 done: loss 0.1076 - lr: 0.000039
2023-10-25 21:23:07,353 DEV : loss 0.09914213418960571 - f1-score (micro avg) 0.7544
2023-10-25 21:23:07,358 saving best model
2023-10-25 21:23:08,006 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:08,960 epoch 4 - iter 13/136 - loss 0.04799237 - time (sec): 0.95 - samples/sec: 5326.71 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:23:09,864 epoch 4 - iter 26/136 - loss 0.05744887 - time (sec): 1.86 - samples/sec: 5258.12 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:23:10,956 epoch 4 - iter 39/136 - loss 0.05011840 - time (sec): 2.95 - samples/sec: 5364.96 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:23:11,915 epoch 4 - iter 52/136 - loss 0.06335648 - time (sec): 3.91 - samples/sec: 5281.09 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:23:12,949 epoch 4 - iter 65/136 - loss 0.05707390 - time (sec): 4.94 - samples/sec: 5229.29 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:23:13,947 epoch 4 - iter 78/136 - loss 0.05712659 - time (sec): 5.94 - samples/sec: 5146.40 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:23:14,885 epoch 4 - iter 91/136 - loss 0.06298637 - time (sec): 6.88 - samples/sec: 5192.18 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:23:15,832 epoch 4 - iter 104/136 - loss 0.05884088 - time (sec): 7.82 - samples/sec: 5184.34 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:23:16,753 epoch 4 - iter 117/136 - loss 0.05907822 - time (sec): 8.74 - samples/sec: 5180.77 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:23:17,753 epoch 4 - iter 130/136 - loss 0.05836541 - time (sec): 9.74 - samples/sec: 5144.96 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:23:18,168 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:18,169 EPOCH 4 done: loss 0.0574 - lr: 0.000034
2023-10-25 21:23:19,439 DEV : loss 0.11081506311893463 - f1-score (micro avg) 0.7933
2023-10-25 21:23:19,445 saving best model
2023-10-25 21:23:20,096 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:21,121 epoch 5 - iter 13/136 - loss 0.02247676 - time (sec): 1.02 - samples/sec: 4833.41 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:23:22,043 epoch 5 - iter 26/136 - loss 0.02563925 - time (sec): 1.95 - samples/sec: 4735.65 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:23:23,049 epoch 5 - iter 39/136 - loss 0.03505350 - time (sec): 2.95 - samples/sec: 4789.08 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:23:24,069 epoch 5 - iter 52/136 - loss 0.03318240 - time (sec): 3.97 - samples/sec: 4883.85 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:23:24,957 epoch 5 - iter 65/136 - loss 0.03056275 - time (sec): 4.86 - samples/sec: 4755.76 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:23:25,951 epoch 5 - iter 78/136 - loss 0.03285871 - time (sec): 5.85 - samples/sec: 4800.79 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:23:27,006 epoch 5 - iter 91/136 - loss 0.03506467 - time (sec): 6.91 - samples/sec: 4895.08 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:23:27,942 epoch 5 - iter 104/136 - loss 0.03246624 - time (sec): 7.84 - samples/sec: 4940.57 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:23:28,929 epoch 5 - iter 117/136 - loss 0.03207926 - time (sec): 8.83 - samples/sec: 4980.09 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:23:30,038 epoch 5 - iter 130/136 - loss 0.03457919 - time (sec): 9.94 - samples/sec: 5007.25 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:23:30,479 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:30,479 EPOCH 5 done: loss 0.0341 - lr: 0.000028
2023-10-25 21:23:31,951 DEV : loss 0.11888163536787033 - f1-score (micro avg) 0.7949
2023-10-25 21:23:31,956 saving best model
2023-10-25 21:23:32,607 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:33,712 epoch 6 - iter 13/136 - loss 0.02123620 - time (sec): 1.10 - samples/sec: 4702.84 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:23:34,631 epoch 6 - iter 26/136 - loss 0.01830822 - time (sec): 2.02 - samples/sec: 5115.63 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:23:35,614 epoch 6 - iter 39/136 - loss 0.02156613 - time (sec): 3.00 - samples/sec: 4949.99 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:23:36,746 epoch 6 - iter 52/136 - loss 0.02165435 - time (sec): 4.14 - samples/sec: 4761.05 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:23:37,721 epoch 6 - iter 65/136 - loss 0.01848345 - time (sec): 5.11 - samples/sec: 4801.07 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:23:38,751 epoch 6 - iter 78/136 - loss 0.01947350 - time (sec): 6.14 - samples/sec: 4893.75 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:23:39,686 epoch 6 - iter 91/136 - loss 0.02372842 - time (sec): 7.08 - samples/sec: 4928.04 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:23:40,697 epoch 6 - iter 104/136 - loss 0.02320998 - time (sec): 8.09 - samples/sec: 4912.87 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:23:41,673 epoch 6 - iter 117/136 - loss 0.02372198 - time (sec): 9.06 - samples/sec: 4909.61 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:23:42,638 epoch 6 - iter 130/136 - loss 0.02408145 - time (sec): 10.03 - samples/sec: 4908.33 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:23:43,073 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:43,074 EPOCH 6 done: loss 0.0240 - lr: 0.000023
2023-10-25 21:23:44,386 DEV : loss 0.13894939422607422 - f1-score (micro avg) 0.792
2023-10-25 21:23:44,392 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:45,351 epoch 7 - iter 13/136 - loss 0.01023683 - time (sec): 0.96 - samples/sec: 5393.57 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:23:46,420 epoch 7 - iter 26/136 - loss 0.00988703 - time (sec): 2.03 - samples/sec: 5056.61 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:23:47,430 epoch 7 - iter 39/136 - loss 0.00855606 - time (sec): 3.04 - samples/sec: 5059.97 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:23:48,440 epoch 7 - iter 52/136 - loss 0.00902452 - time (sec): 4.05 - samples/sec: 5093.50 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:23:49,587 epoch 7 - iter 65/136 - loss 0.01254680 - time (sec): 5.19 - samples/sec: 4986.34 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:23:50,586 epoch 7 - iter 78/136 - loss 0.01308189 - time (sec): 6.19 - samples/sec: 4947.44 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:23:51,687 epoch 7 - iter 91/136 - loss 0.01516775 - time (sec): 7.29 - samples/sec: 4887.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:23:52,629 epoch 7 - iter 104/136 - loss 0.01675394 - time (sec): 8.24 - samples/sec: 4928.48 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:23:53,581 epoch 7 - iter 117/136 - loss 0.01755872 - time (sec): 9.19 - samples/sec: 4896.27 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:23:54,563 epoch 7 - iter 130/136 - loss 0.01838104 - time (sec): 10.17 - samples/sec: 4906.70 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:23:55,045 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:55,046 EPOCH 7 done: loss 0.0180 - lr: 0.000017
2023-10-25 21:23:56,352 DEV : loss 0.14390961825847626 - f1-score (micro avg) 0.8007
2023-10-25 21:23:56,359 saving best model
2023-10-25 21:23:57,482 ----------------------------------------------------------------------------------------------------
2023-10-25 21:23:58,480 epoch 8 - iter 13/136 - loss 0.01707130 - time (sec): 1.00 - samples/sec: 5902.42 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:23:59,424 epoch 8 - iter 26/136 - loss 0.01715658 - time (sec): 1.94 - samples/sec: 5637.99 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:24:00,502 epoch 8 - iter 39/136 - loss 0.01607778 - time (sec): 3.02 - samples/sec: 5407.40 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:24:01,409 epoch 8 - iter 52/136 - loss 0.01464815 - time (sec): 3.92 - samples/sec: 5338.78 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:24:02,478 epoch 8 - iter 65/136 - loss 0.01481962 - time (sec): 4.99 - samples/sec: 5222.95 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:24:03,465 epoch 8 - iter 78/136 - loss 0.01416660 - time (sec): 5.98 - samples/sec: 5196.59 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:24:04,492 epoch 8 - iter 91/136 - loss 0.01363303 - time (sec): 7.01 - samples/sec: 5146.20 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:24:05,474 epoch 8 - iter 104/136 - loss 0.01327422 - time (sec): 7.99 - samples/sec: 5138.33 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:24:06,500 epoch 8 - iter 117/136 - loss 0.01229904 - time (sec): 9.02 - samples/sec: 5073.80 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:24:07,552 epoch 8 - iter 130/136 - loss 0.01172602 - time (sec): 10.07 - samples/sec: 4985.49 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:24:07,981 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:07,981 EPOCH 8 done: loss 0.0120 - lr: 0.000012
2023-10-25 21:24:09,205 DEV : loss 0.14648671448230743 - f1-score (micro avg) 0.8059
2023-10-25 21:24:09,212 saving best model
2023-10-25 21:24:09,882 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:10,933 epoch 9 - iter 13/136 - loss 0.00211235 - time (sec): 1.05 - samples/sec: 4725.27 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:24:11,954 epoch 9 - iter 26/136 - loss 0.00666773 - time (sec): 2.07 - samples/sec: 5156.50 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:24:12,849 epoch 9 - iter 39/136 - loss 0.00638834 - time (sec): 2.96 - samples/sec: 5192.18 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:24:13,723 epoch 9 - iter 52/136 - loss 0.00686059 - time (sec): 3.84 - samples/sec: 5093.48 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:24:14,702 epoch 9 - iter 65/136 - loss 0.00731458 - time (sec): 4.82 - samples/sec: 5200.74 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:24:15,617 epoch 9 - iter 78/136 - loss 0.00764700 - time (sec): 5.73 - samples/sec: 5173.20 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:24:16,706 epoch 9 - iter 91/136 - loss 0.00708190 - time (sec): 6.82 - samples/sec: 5227.13 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:24:17,600 epoch 9 - iter 104/136 - loss 0.00681427 - time (sec): 7.71 - samples/sec: 5166.31 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:24:18,567 epoch 9 - iter 117/136 - loss 0.00709413 - time (sec): 8.68 - samples/sec: 5121.21 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:24:19,601 epoch 9 - iter 130/136 - loss 0.00726556 - time (sec): 9.72 - samples/sec: 5093.54 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:24:20,105 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:20,105 EPOCH 9 done: loss 0.0075 - lr: 0.000006
2023-10-25 21:24:21,277 DEV : loss 0.14979466795921326 - f1-score (micro avg) 0.8222
2023-10-25 21:24:21,284 saving best model
2023-10-25 21:24:21,932 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:22,991 epoch 10 - iter 13/136 - loss 0.00573592 - time (sec): 1.06 - samples/sec: 4604.37 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:24:24,354 epoch 10 - iter 26/136 - loss 0.00771814 - time (sec): 2.42 - samples/sec: 4054.98 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:24:25,410 epoch 10 - iter 39/136 - loss 0.00673013 - time (sec): 3.48 - samples/sec: 4301.76 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:24:26,449 epoch 10 - iter 52/136 - loss 0.00561897 - time (sec): 4.51 - samples/sec: 4588.30 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:24:27,420 epoch 10 - iter 65/136 - loss 0.00554251 - time (sec): 5.49 - samples/sec: 4623.66 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:24:28,326 epoch 10 - iter 78/136 - loss 0.00502923 - time (sec): 6.39 - samples/sec: 4684.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:24:29,339 epoch 10 - iter 91/136 - loss 0.00506657 - time (sec): 7.40 - samples/sec: 4781.52 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:24:30,377 epoch 10 - iter 104/136 - loss 0.00484601 - time (sec): 8.44 - samples/sec: 4779.86 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:24:31,227 epoch 10 - iter 117/136 - loss 0.00515945 - time (sec): 9.29 - samples/sec: 4795.92 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:24:32,261 epoch 10 - iter 130/136 - loss 0.00489036 - time (sec): 10.33 - samples/sec: 4816.89 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:24:32,674 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:32,674 EPOCH 10 done: loss 0.0052 - lr: 0.000000
2023-10-25 21:24:33,806 DEV : loss 0.15492050349712372 - f1-score (micro avg) 0.8177
2023-10-25 21:24:34,274 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:34,276 Loading model from best epoch ...
2023-10-25 21:24:36,160 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-25 21:24:38,129
Results:
- F-score (micro) 0.7872
- F-score (macro) 0.739
- Accuracy 0.6627

By class:
              precision    recall  f1-score   support

         LOC     0.8081    0.8910    0.8476       312
         PER     0.6716    0.8654    0.7563       208
         ORG     0.5714    0.4364    0.4948        55
   HumanProd     0.7778    0.9545    0.8571        22

   micro avg     0.7386    0.8425    0.7872       597
   macro avg     0.7072    0.7868    0.7390       597
weighted avg     0.7377    0.8425    0.7836       597

2023-10-25 21:24:38,129 ----------------------------------------------------------------------------------------------------
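
The summary rows of the "By class" report can be cross-checked against the per-class rows. The sketch below is not part of the training run or of Flair; it is a small standalone check, and the helper names (`macro_avg`, `weighted_avg`, `f1`) are made up for illustration. It assumes the usual definitions: macro average is the unweighted mean over classes, weighted average is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. Because the inputs are already rounded to four decimals, small differences in the last digit are expected.

```python
# Per-class rows copied from the "By class" table above:
# (precision, recall, f1-score, support)
rows = {
    "LOC": (0.8081, 0.8910, 0.8476, 312),
    "PER": (0.6716, 0.8654, 0.7563, 208),
    "ORG": (0.5714, 0.4364, 0.4948, 55),
    "HumanProd": (0.7778, 0.9545, 0.8571, 22),
}


def macro_avg(rows, idx):
    """Unweighted mean of column idx over all classes."""
    return sum(r[idx] for r in rows.values()) / len(rows)


def weighted_avg(rows, idx):
    """Mean of column idx, weighted by each class's support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[idx] * r[3] for r in rows.values()) / total


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(r[3] for r in rows.values())
macro_f1 = macro_avg(rows, 2)
weighted_f1 = weighted_avg(rows, 2)
# Micro precision and recall are taken from the "micro avg" row of the log.
micro_f1 = f1(0.7386, 0.8425)

print(f"support total: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
print(f"micro F1:      {micro_f1:.4f}")
```

The low ORG score (F1 0.4948 on 55 mentions) pulls the macro average well below the micro average, which is dominated by the 312 LOC and 208 PER mentions.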