2023-11-16 08:54:28,192 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,194 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,194 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
 - ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences
 - /root/.flair/datasets/ner_multi_xtreme/en
 - ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences
 - /root/.flair/datasets/ner_multi_xtreme/ka
2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,194 Train: 30000 sentences
2023-11-16 08:54:28,194 (train_with_dev=False, train_with_test=False)
2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,194 Training Params:
2023-11-16 08:54:28,194 - learning_rate: "5e-06"
2023-11-16 08:54:28,194 - mini_batch_size: "4"
2023-11-16 08:54:28,194 - max_epochs: "10"
2023-11-16 08:54:28,194 - shuffle: "True"
2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,194 Plugins:
2023-11-16 08:54:28,194 - TensorboardLogger
2023-11-16 08:54:28,194 - LinearScheduler | warmup_fraction: '0.1'
2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,194 Final evaluation on model from best epoch (best-model.pt)
2023-11-16 08:54:28,194 - metric: "('micro avg', 'f1-score')"
2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,195 Computation:
2023-11-16 08:54:28,195 - compute on device: cuda:0
2023-11-16 08:54:28,195 - embedding storage: none
2023-11-16 08:54:28,195 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,195 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-5"
2023-11-16 08:54:28,195 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,195 ----------------------------------------------------------------------------------------------------
2023-11-16 08:54:28,195 Logging anything other than scalars to TensorBoard is currently not supported.
2023-11-16 08:56:03,042 epoch 1 - iter 750/7500 - loss 2.73514177 - time (sec): 94.85 - samples/sec: 251.28 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:57:34,557 epoch 1 - iter 1500/7500 - loss 2.27024326 - time (sec): 186.36 - samples/sec: 258.14 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:59:07,587 epoch 1 - iter 2250/7500 - loss 1.96361468 - time (sec): 279.39 - samples/sec: 259.21 - lr: 0.000001 - momentum: 0.000000
2023-11-16 09:00:41,392 epoch 1 - iter 3000/7500 - loss 1.71842298 - time (sec): 373.20 - samples/sec: 259.19 - lr: 0.000002 - momentum: 0.000000
2023-11-16 09:02:13,057 epoch 1 - iter 3750/7500 - loss 1.50867437 - time (sec): 464.86 - samples/sec: 260.24 - lr: 0.000002 - momentum: 0.000000
2023-11-16 09:03:44,283 epoch 1 - iter 4500/7500 - loss 1.34975880 - time (sec): 556.09 - samples/sec: 261.24 - lr: 0.000003 - momentum: 0.000000
2023-11-16 09:05:17,789 epoch 1 - iter 5250/7500 - loss 1.23153815 - time (sec): 649.59 - samples/sec: 261.16 - lr: 0.000003 - momentum: 0.000000
2023-11-16 09:06:51,967 epoch 1 - iter 6000/7500 - loss 1.13901611 - time (sec): 743.77 - samples/sec: 260.11 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:08:25,989 epoch 1 - iter 6750/7500 - loss 1.06518907 - time (sec): 837.79 - samples/sec: 259.14 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:10:02,189 epoch 1 - iter 7500/7500 - loss 1.00335008 - time (sec): 933.99 - samples/sec: 257.81 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:10:02,191 ----------------------------------------------------------------------------------------------------
2023-11-16 09:10:02,191 EPOCH 1 done: loss 1.0034 - lr: 0.000005
2023-11-16 09:10:29,151 DEV : loss 0.25209933519363403 - f1-score (micro avg) 0.818
2023-11-16 09:10:30,799 saving best model
2023-11-16 09:10:32,550 ----------------------------------------------------------------------------------------------------
2023-11-16 09:12:05,182 epoch 2 - iter 750/7500 - loss 0.40758191 - time (sec): 92.63 - samples/sec: 255.89 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:13:40,646 epoch 2 - iter 1500/7500 - loss 0.41354608 - time (sec): 188.09 - samples/sec: 252.51 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:15:13,127 epoch 2 - iter 2250/7500 - loss 0.41469889 - time (sec): 280.57 - samples/sec: 256.53 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:16:46,617 epoch 2 - iter 3000/7500 - loss 0.40864404 - time (sec): 374.06 - samples/sec: 257.65 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:18:19,301 epoch 2 - iter 3750/7500 - loss 0.40728475 - time (sec): 466.75 - samples/sec: 258.79 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:19:50,423 epoch 2 - iter 4500/7500 - loss 0.40431483 - time (sec): 557.87 - samples/sec: 259.91 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:21:22,320 epoch 2 - iter 5250/7500 - loss 0.40010819 - time (sec): 649.77 - samples/sec: 260.46 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:22:53,693 epoch 2 - iter 6000/7500 - loss 0.39994412 - time (sec): 741.14 - samples/sec: 260.74 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:24:27,071 epoch 2 - iter 6750/7500 - loss 0.40164576 - time (sec): 834.52 - samples/sec: 260.23 - lr: 0.000005 - momentum: 0.000000
2023-11-16 09:26:00,518 epoch 2 - iter 7500/7500 - loss 0.40139034 - time (sec): 927.97 - samples/sec: 259.49 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:26:00,523 ----------------------------------------------------------------------------------------------------
2023-11-16 09:26:00,523 EPOCH 2 done: loss 0.4014 - lr: 0.000004
2023-11-16 09:26:28,590 DEV : loss 0.27011075615882874 - f1-score (micro avg) 0.8685
2023-11-16 09:26:30,477 saving best model
2023-11-16 09:26:32,489 ----------------------------------------------------------------------------------------------------
2023-11-16 09:28:06,865 epoch 3 - iter 750/7500 - loss 0.34664813 - time (sec): 94.37 - samples/sec: 259.00 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:29:40,525 epoch 3 - iter 1500/7500 - loss 0.35802918 - time (sec): 188.03 - samples/sec: 256.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:31:13,248 epoch 3 - iter 2250/7500 - loss 0.34949160 - time (sec): 280.76 - samples/sec: 259.11 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:32:44,431 epoch 3 - iter 3000/7500 - loss 0.34400653 - time (sec): 371.94 - samples/sec: 261.74 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:34:18,607 epoch 3 - iter 3750/7500 - loss 0.34736997 - time (sec): 466.12 - samples/sec: 259.21 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:35:51,979 epoch 3 - iter 4500/7500 - loss 0.34780959 - time (sec): 559.49 - samples/sec: 258.64 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:37:25,102 epoch 3 - iter 5250/7500 - loss 0.34688026 - time (sec): 652.61 - samples/sec: 258.12 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:38:58,439 epoch 3 - iter 6000/7500 - loss 0.34475842 - time (sec): 745.95 - samples/sec: 257.81 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:40:34,113 epoch 3 - iter 6750/7500 - loss 0.34349763 - time (sec): 841.62 - samples/sec: 256.89 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:42:10,676 epoch 3 - iter 7500/7500 - loss 0.34117216 - time (sec): 938.18 - samples/sec: 256.66 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:42:10,679 ----------------------------------------------------------------------------------------------------
2023-11-16 09:42:10,679 EPOCH 3 done: loss 0.3412 - lr: 0.000004
2023-11-16 09:42:37,676 DEV : loss 0.2926769554615021 - f1-score (micro avg) 0.8843
2023-11-16 09:42:39,581 saving best model
2023-11-16 09:42:41,582 ----------------------------------------------------------------------------------------------------
2023-11-16 09:44:14,986 epoch 4 - iter 750/7500 - loss 0.28935096 - time (sec): 93.40 - samples/sec: 255.70 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:45:47,989 epoch 4 - iter 1500/7500 - loss 0.29093752 - time (sec): 186.40 - samples/sec: 257.90 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:47:19,359 epoch 4 - iter 2250/7500 - loss 0.29888625 - time (sec): 277.77 - samples/sec: 259.38 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:48:51,599 epoch 4 - iter 3000/7500 - loss 0.30114670 - time (sec): 370.01 - samples/sec: 258.77 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:50:23,345 epoch 4 - iter 3750/7500 - loss 0.30007515 - time (sec): 461.76 - samples/sec: 259.19 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:51:56,749 epoch 4 - iter 4500/7500 - loss 0.29780444 - time (sec): 555.16 - samples/sec: 259.74 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:53:29,783 epoch 4 - iter 5250/7500 - loss 0.29632423 - time (sec): 648.20 - samples/sec: 260.29 - lr: 0.000004 - momentum: 0.000000
2023-11-16 09:55:03,332 epoch 4 - iter 6000/7500 - loss 0.29715629 - time (sec): 741.75 - samples/sec: 259.64 - lr: 0.000003 - momentum: 0.000000
2023-11-16 09:56:39,271 epoch 4 - iter 6750/7500 - loss 0.29789543 - time (sec): 837.69 - samples/sec: 258.76 - lr: 0.000003 - momentum: 0.000000
2023-11-16 09:58:14,979 epoch 4 - iter 7500/7500 - loss 0.30080354 - time (sec): 933.39 - samples/sec: 257.98 - lr: 0.000003 - momentum: 0.000000
2023-11-16 09:58:14,982 ----------------------------------------------------------------------------------------------------
2023-11-16 09:58:14,982 EPOCH 4 done: loss 0.3008 - lr: 0.000003
2023-11-16 09:58:42,391 DEV : loss 0.24168777465820312 - f1-score (micro avg) 0.8958
2023-11-16 09:58:44,896 saving best model
2023-11-16 09:58:47,890 ----------------------------------------------------------------------------------------------------
2023-11-16 10:00:21,193 epoch 5 - iter 750/7500 - loss 0.24325762 - time (sec): 93.30 - samples/sec: 254.74 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:01:54,947 epoch 5 - iter 1500/7500 - loss 0.24699916 - time (sec): 187.05 - samples/sec: 256.42 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:03:33,373 epoch 5 - iter 2250/7500 - loss 0.24105182 - time (sec): 285.48 - samples/sec: 251.76 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:05:10,591 epoch 5 - iter 3000/7500 - loss 0.24548635 - time (sec): 382.70 - samples/sec: 250.75 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:06:45,160 epoch 5 - iter 3750/7500 - loss 0.24697996 - time (sec): 477.27 - samples/sec: 252.00 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:08:18,840 epoch 5 - iter 4500/7500 - loss 0.24902921 - time (sec): 570.95 - samples/sec: 252.17 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:09:54,495 epoch 5 - iter 5250/7500 - loss 0.24900570 - time (sec): 666.60 - samples/sec: 251.98 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:11:28,132 epoch 5 - iter 6000/7500 - loss 0.25246330 - time (sec): 760.24 - samples/sec: 253.26 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:13:02,532 epoch 5 - iter 6750/7500 - loss 0.25090384 - time (sec): 854.64 - samples/sec: 253.77 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:14:33,980 epoch 5 - iter 7500/7500 - loss 0.25096782 - time (sec): 946.09 - samples/sec: 254.52 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:14:33,983 ----------------------------------------------------------------------------------------------------
2023-11-16 10:14:33,983 EPOCH 5 done: loss 0.2510 - lr: 0.000003
2023-11-16 10:15:01,770 DEV : loss 0.30133897066116333 - f1-score (micro avg) 0.8909
2023-11-16 10:15:04,505 ----------------------------------------------------------------------------------------------------
2023-11-16 10:16:43,786 epoch 6 - iter 750/7500 - loss 0.21043668 - time (sec): 99.28 - samples/sec: 246.62 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:18:21,438 epoch 6 - iter 1500/7500 - loss 0.21551137 - time (sec): 196.93 - samples/sec: 244.08 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:19:58,967 epoch 6 - iter 2250/7500 - loss 0.21671764 - time (sec): 294.46 - samples/sec: 244.90 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:21:36,357 epoch 6 - iter 3000/7500 - loss 0.21410789 - time (sec): 391.85 - samples/sec: 245.03 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:23:11,820 epoch 6 - iter 3750/7500 - loss 0.21190957 - time (sec): 487.31 - samples/sec: 246.63 - lr: 0.000003 - momentum: 0.000000
2023-11-16 10:24:44,519 epoch 6 - iter 4500/7500 - loss 0.21724671 - time (sec): 580.01 - samples/sec: 248.11 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:26:19,240 epoch 6 - iter 5250/7500 - loss 0.21517436 - time (sec): 674.73 - samples/sec: 249.52 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:27:52,483 epoch 6 - iter 6000/7500 - loss 0.21502133 - time (sec): 767.97 - samples/sec: 250.59 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:29:26,618 epoch 6 - iter 6750/7500 - loss 0.21284089 - time (sec): 862.11 - samples/sec: 251.34 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:31:02,401 epoch 6 - iter 7500/7500 - loss 0.21376183 - time (sec): 957.89 - samples/sec: 251.38 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:31:02,403 ----------------------------------------------------------------------------------------------------
2023-11-16 10:31:02,404 EPOCH 6 done: loss 0.2138 - lr: 0.000002
2023-11-16 10:31:29,893 DEV : loss 0.2858603894710541 - f1-score (micro avg) 0.9013
2023-11-16 10:31:32,354 saving best model
2023-11-16 10:31:34,717 ----------------------------------------------------------------------------------------------------
2023-11-16 10:33:10,947 epoch 7 - iter 750/7500 - loss 0.18560997 - time (sec): 96.22 - samples/sec: 252.05 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:34:45,274 epoch 7 - iter 1500/7500 - loss 0.18388762 - time (sec): 190.55 - samples/sec: 254.81 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:36:20,608 epoch 7 - iter 2250/7500 - loss 0.17300910 - time (sec): 285.89 - samples/sec: 254.67 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:37:56,122 epoch 7 - iter 3000/7500 - loss 0.18185895 - time (sec): 381.40 - samples/sec: 253.44 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:39:29,201 epoch 7 - iter 3750/7500 - loss 0.18240739 - time (sec): 474.48 - samples/sec: 253.00 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:41:03,810 epoch 7 - iter 4500/7500 - loss 0.18167213 - time (sec): 569.09 - samples/sec: 253.57 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:42:34,871 epoch 7 - iter 5250/7500 - loss 0.18305956 - time (sec): 660.15 - samples/sec: 255.17 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:44:06,443 epoch 7 - iter 6000/7500 - loss 0.18397991 - time (sec): 751.72 - samples/sec: 256.24 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:45:40,244 epoch 7 - iter 6750/7500 - loss 0.18284928 - time (sec): 845.52 - samples/sec: 256.33 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:47:13,133 epoch 7 - iter 7500/7500 - loss 0.18356346 - time (sec): 938.41 - samples/sec: 256.60 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:47:13,135 ----------------------------------------------------------------------------------------------------
2023-11-16 10:47:13,136 EPOCH 7 done: loss 0.1836 - lr: 0.000002
2023-11-16 10:47:40,667 DEV : loss 0.3011305034160614 - f1-score (micro avg) 0.8987
2023-11-16 10:47:42,779 ----------------------------------------------------------------------------------------------------
2023-11-16 10:49:16,539 epoch 8 - iter 750/7500 - loss 0.14330999 - time (sec): 93.76 - samples/sec: 256.81 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:50:50,862 epoch 8 - iter 1500/7500 - loss 0.14160047 - time (sec): 188.08 - samples/sec: 252.69 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:52:23,414 epoch 8 - iter 2250/7500 - loss 0.14478770 - time (sec): 280.63 - samples/sec: 256.60 - lr: 0.000002 - momentum: 0.000000
2023-11-16 10:53:56,497 epoch 8 - iter 3000/7500 - loss 0.14930840 - time (sec): 373.71 - samples/sec: 258.22 - lr: 0.000001 - momentum: 0.000000
2023-11-16 10:55:28,192 epoch 8 - iter 3750/7500 - loss 0.15175926 - time (sec): 465.41 - samples/sec: 258.55 - lr: 0.000001 - momentum: 0.000000
2023-11-16 10:57:00,742 epoch 8 - iter 4500/7500 - loss 0.15482872 - time (sec): 557.96 - samples/sec: 259.33 - lr: 0.000001 - momentum: 0.000000
2023-11-16 10:58:32,426 epoch 8 - iter 5250/7500 - loss 0.15110368 - time (sec): 649.64 - samples/sec: 259.55 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:00:02,971 epoch 8 - iter 6000/7500 - loss 0.15064249 - time (sec): 740.19 - samples/sec: 260.02 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:01:35,563 epoch 8 - iter 6750/7500 - loss 0.15119609 - time (sec): 832.78 - samples/sec: 260.42 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:03:09,780 epoch 8 - iter 7500/7500 - loss 0.15276122 - time (sec): 927.00 - samples/sec: 259.76 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:03:09,783 ----------------------------------------------------------------------------------------------------
2023-11-16 11:03:09,783 EPOCH 8 done: loss 0.1528 - lr: 0.000001
2023-11-16 11:03:37,801 DEV : loss 0.31595587730407715 - f1-score (micro avg) 0.9048
2023-11-16 11:03:40,424 saving best model
2023-11-16 11:03:43,075 ----------------------------------------------------------------------------------------------------
2023-11-16 11:05:19,854 epoch 9 - iter 750/7500 - loss 0.12299891 - time (sec): 96.77 - samples/sec: 253.49 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:06:56,996 epoch 9 - iter 1500/7500 - loss 0.12314833 - time (sec): 193.92 - samples/sec: 248.21 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:08:31,883 epoch 9 - iter 2250/7500 - loss 0.13008764 - time (sec): 288.80 - samples/sec: 250.26 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:10:08,385 epoch 9 - iter 3000/7500 - loss 0.13080561 - time (sec): 385.30 - samples/sec: 250.44 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:11:42,064 epoch 9 - iter 3750/7500 - loss 0.13273049 - time (sec): 478.98 - samples/sec: 251.53 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:13:16,037 epoch 9 - iter 4500/7500 - loss 0.13360348 - time (sec): 572.96 - samples/sec: 251.77 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:14:49,720 epoch 9 - iter 5250/7500 - loss 0.13299160 - time (sec): 666.64 - samples/sec: 252.02 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:16:23,602 epoch 9 - iter 6000/7500 - loss 0.13133134 - time (sec): 760.52 - samples/sec: 252.84 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:17:56,438 epoch 9 - iter 6750/7500 - loss 0.13363917 - time (sec): 853.36 - samples/sec: 253.29 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:19:29,919 epoch 9 - iter 7500/7500 - loss 0.13122225 - time (sec): 946.84 - samples/sec: 254.32 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:19:29,921 ----------------------------------------------------------------------------------------------------
2023-11-16 11:19:29,921 EPOCH 9 done: loss 0.1312 - lr: 0.000001
2023-11-16 11:19:57,777 DEV : loss 0.33471381664276123 - f1-score (micro avg) 0.9024
2023-11-16 11:20:00,060 ----------------------------------------------------------------------------------------------------
2023-11-16 11:21:34,000 epoch 10 - iter 750/7500 - loss 0.11743559 - time (sec): 93.94 - samples/sec: 260.10 - lr: 0.000001 - momentum: 0.000000
2023-11-16 11:23:09,621 epoch 10 - iter 1500/7500 - loss 0.12274941 - time (sec): 189.56 - samples/sec: 254.16 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:24:43,426 epoch 10 - iter 2250/7500 - loss 0.11588386 - time (sec): 283.36 - samples/sec: 255.95 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:26:20,492 epoch 10 - iter 3000/7500 - loss 0.11238624 - time (sec): 380.43 - samples/sec: 255.77 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:27:52,662 epoch 10 - iter 3750/7500 - loss 0.11433172 - time (sec): 472.60 - samples/sec: 256.50 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:29:26,109 epoch 10 - iter 4500/7500 - loss 0.11525478 - time (sec): 566.05 - samples/sec: 256.49 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:30:58,606 epoch 10 - iter 5250/7500 - loss 0.11793983 - time (sec): 658.54 - samples/sec: 256.98 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:32:30,455 epoch 10 - iter 6000/7500 - loss 0.11586937 - time (sec): 750.39 - samples/sec: 257.62 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:34:03,956 epoch 10 - iter 6750/7500 - loss 0.11407474 - time (sec): 843.89 - samples/sec: 257.08 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:35:37,383 epoch 10 - iter 7500/7500 - loss 0.11399531 - time (sec): 937.32 - samples/sec: 256.90 - lr: 0.000000 - momentum: 0.000000
2023-11-16 11:35:37,386 ----------------------------------------------------------------------------------------------------
2023-11-16 11:35:37,386 EPOCH 10 done: loss 0.1140 - lr: 0.000000
2023-11-16 11:36:05,240 DEV : loss 0.3250023126602173 - f1-score (micro avg) 0.9045
2023-11-16 11:36:08,837 ----------------------------------------------------------------------------------------------------
2023-11-16 11:36:08,840 Loading model from best epoch ...
2023-11-16 11:36:16,806 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
2023-11-16 11:36:45,465 Results:
- F-score (micro) 0.9042
- F-score (macro) 0.9031
- Accuracy 0.8544

By class:
              precision    recall  f1-score   support

         LOC     0.9076    0.9106    0.9091      5288
         PER     0.9288    0.9419    0.9353      3962
         ORG     0.8606    0.8695    0.8650      3807

   micro avg     0.9004    0.9081    0.9042     13057
   macro avg     0.8990    0.9073    0.9031     13057
weighted avg     0.9004    0.9081    0.9042     13057

2023-11-16 11:36:45,466 ----------------------------------------------------------------------------------------------------