2023-10-25 12:55:17,372 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,373 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 12:55:17,373 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,373 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 12:55:17,373 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,373 Train: 6183 sentences
2023-10-25 12:55:17,373 (train_with_dev=False, train_with_test=False)
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Training Params:
2023-10-25 12:55:17,374 - learning_rate: "3e-05"
2023-10-25 12:55:17,374 - mini_batch_size: "8"
2023-10-25 12:55:17,374 - max_epochs: "10"
2023-10-25 12:55:17,374 - shuffle: "True"
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Plugins:
2023-10-25 12:55:17,374 - TensorboardLogger
2023-10-25 12:55:17,374 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 12:55:17,374 - metric: "('micro avg', 'f1-score')"
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Computation:
2023-10-25 12:55:17,374 - compute on device: cuda:0
2023-10-25 12:55:17,374 - embedding storage: none
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Logging anything other than
scalars to TensorBoard is currently not supported. 2023-10-25 12:55:22,325 epoch 1 - iter 77/773 - loss 1.93757916 - time (sec): 4.95 - samples/sec: 2633.95 - lr: 0.000003 - momentum: 0.000000 2023-10-25 12:55:27,147 epoch 1 - iter 154/773 - loss 1.11916091 - time (sec): 9.77 - samples/sec: 2623.41 - lr: 0.000006 - momentum: 0.000000 2023-10-25 12:55:31,976 epoch 1 - iter 231/773 - loss 0.82756714 - time (sec): 14.60 - samples/sec: 2575.99 - lr: 0.000009 - momentum: 0.000000 2023-10-25 12:55:36,821 epoch 1 - iter 308/773 - loss 0.66515291 - time (sec): 19.45 - samples/sec: 2548.10 - lr: 0.000012 - momentum: 0.000000 2023-10-25 12:55:41,560 epoch 1 - iter 385/773 - loss 0.56018412 - time (sec): 24.18 - samples/sec: 2536.59 - lr: 0.000015 - momentum: 0.000000 2023-10-25 12:55:46,415 epoch 1 - iter 462/773 - loss 0.48843856 - time (sec): 29.04 - samples/sec: 2539.21 - lr: 0.000018 - momentum: 0.000000 2023-10-25 12:55:51,206 epoch 1 - iter 539/773 - loss 0.43466720 - time (sec): 33.83 - samples/sec: 2528.25 - lr: 0.000021 - momentum: 0.000000 2023-10-25 12:55:56,079 epoch 1 - iter 616/773 - loss 0.39209967 - time (sec): 38.70 - samples/sec: 2530.25 - lr: 0.000024 - momentum: 0.000000 2023-10-25 12:56:00,886 epoch 1 - iter 693/773 - loss 0.35744237 - time (sec): 43.51 - samples/sec: 2545.35 - lr: 0.000027 - momentum: 0.000000 2023-10-25 12:56:05,722 epoch 1 - iter 770/773 - loss 0.32866683 - time (sec): 48.35 - samples/sec: 2563.64 - lr: 0.000030 - momentum: 0.000000 2023-10-25 12:56:05,903 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:56:05,903 EPOCH 1 done: loss 0.3279 - lr: 0.000030 2023-10-25 12:56:09,118 DEV : loss 0.04656985402107239 - f1-score (micro avg) 0.7598 2023-10-25 12:56:09,136 saving best model 2023-10-25 12:56:09,633 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:56:14,324 epoch 2 - iter 77/773 - loss 0.08563895 - 
time (sec): 4.69 - samples/sec: 2443.89 - lr: 0.000030 - momentum: 0.000000 2023-10-25 12:56:19,149 epoch 2 - iter 154/773 - loss 0.07537749 - time (sec): 9.51 - samples/sec: 2436.76 - lr: 0.000029 - momentum: 0.000000 2023-10-25 12:56:24,014 epoch 2 - iter 231/773 - loss 0.07532105 - time (sec): 14.38 - samples/sec: 2440.29 - lr: 0.000029 - momentum: 0.000000 2023-10-25 12:56:28,747 epoch 2 - iter 308/773 - loss 0.07738630 - time (sec): 19.11 - samples/sec: 2509.80 - lr: 0.000029 - momentum: 0.000000 2023-10-25 12:56:33,915 epoch 2 - iter 385/773 - loss 0.07528257 - time (sec): 24.28 - samples/sec: 2523.33 - lr: 0.000028 - momentum: 0.000000 2023-10-25 12:56:38,709 epoch 2 - iter 462/773 - loss 0.07420890 - time (sec): 29.07 - samples/sec: 2528.91 - lr: 0.000028 - momentum: 0.000000 2023-10-25 12:56:43,381 epoch 2 - iter 539/773 - loss 0.07258413 - time (sec): 33.75 - samples/sec: 2581.48 - lr: 0.000028 - momentum: 0.000000 2023-10-25 12:56:48,044 epoch 2 - iter 616/773 - loss 0.07168966 - time (sec): 38.41 - samples/sec: 2585.65 - lr: 0.000027 - momentum: 0.000000 2023-10-25 12:56:52,727 epoch 2 - iter 693/773 - loss 0.07240517 - time (sec): 43.09 - samples/sec: 2583.59 - lr: 0.000027 - momentum: 0.000000 2023-10-25 12:56:57,388 epoch 2 - iter 770/773 - loss 0.07049818 - time (sec): 47.75 - samples/sec: 2593.56 - lr: 0.000027 - momentum: 0.000000 2023-10-25 12:56:57,548 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:56:57,548 EPOCH 2 done: loss 0.0705 - lr: 0.000027 2023-10-25 12:57:00,100 DEV : loss 0.05241599678993225 - f1-score (micro avg) 0.766 2023-10-25 12:57:00,121 saving best model 2023-10-25 12:57:00,787 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:57:05,053 epoch 3 - iter 77/773 - loss 0.03937428 - time (sec): 4.26 - samples/sec: 2830.87 - lr: 0.000026 - momentum: 0.000000 2023-10-25 12:57:10,111 epoch 3 - iter 
154/773 - loss 0.03820952 - time (sec): 9.32 - samples/sec: 2597.71 - lr: 0.000026 - momentum: 0.000000 2023-10-25 12:57:14,565 epoch 3 - iter 231/773 - loss 0.04003269 - time (sec): 13.77 - samples/sec: 2757.09 - lr: 0.000026 - momentum: 0.000000 2023-10-25 12:57:18,915 epoch 3 - iter 308/773 - loss 0.04207363 - time (sec): 18.12 - samples/sec: 2736.31 - lr: 0.000025 - momentum: 0.000000 2023-10-25 12:57:23,359 epoch 3 - iter 385/773 - loss 0.04520442 - time (sec): 22.57 - samples/sec: 2747.11 - lr: 0.000025 - momentum: 0.000000 2023-10-25 12:57:27,600 epoch 3 - iter 462/773 - loss 0.04505890 - time (sec): 26.81 - samples/sec: 2750.23 - lr: 0.000025 - momentum: 0.000000 2023-10-25 12:57:31,826 epoch 3 - iter 539/773 - loss 0.04438068 - time (sec): 31.04 - samples/sec: 2778.92 - lr: 0.000024 - momentum: 0.000000 2023-10-25 12:57:36,080 epoch 3 - iter 616/773 - loss 0.04412004 - time (sec): 35.29 - samples/sec: 2803.10 - lr: 0.000024 - momentum: 0.000000 2023-10-25 12:57:40,438 epoch 3 - iter 693/773 - loss 0.04308429 - time (sec): 39.65 - samples/sec: 2803.93 - lr: 0.000024 - momentum: 0.000000 2023-10-25 12:57:44,861 epoch 3 - iter 770/773 - loss 0.04355937 - time (sec): 44.07 - samples/sec: 2811.96 - lr: 0.000023 - momentum: 0.000000 2023-10-25 12:57:45,041 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:57:45,042 EPOCH 3 done: loss 0.0435 - lr: 0.000023 2023-10-25 12:57:47,526 DEV : loss 0.07553908228874207 - f1-score (micro avg) 0.7581 2023-10-25 12:57:47,545 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:57:52,456 epoch 4 - iter 77/773 - loss 0.02787168 - time (sec): 4.91 - samples/sec: 2464.84 - lr: 0.000023 - momentum: 0.000000 2023-10-25 12:57:57,409 epoch 4 - iter 154/773 - loss 0.02466669 - time (sec): 9.86 - samples/sec: 2463.70 - lr: 0.000023 - momentum: 0.000000 2023-10-25 12:58:01,920 epoch 4 - iter 231/773 - loss 
0.02719373 - time (sec): 14.37 - samples/sec: 2567.69 - lr: 0.000022 - momentum: 0.000000 2023-10-25 12:58:06,111 epoch 4 - iter 308/773 - loss 0.02590039 - time (sec): 18.56 - samples/sec: 2641.33 - lr: 0.000022 - momentum: 0.000000 2023-10-25 12:58:10,335 epoch 4 - iter 385/773 - loss 0.02730470 - time (sec): 22.79 - samples/sec: 2649.13 - lr: 0.000022 - momentum: 0.000000 2023-10-25 12:58:14,488 epoch 4 - iter 462/773 - loss 0.02758635 - time (sec): 26.94 - samples/sec: 2667.62 - lr: 0.000021 - momentum: 0.000000 2023-10-25 12:58:19,096 epoch 4 - iter 539/773 - loss 0.02908079 - time (sec): 31.55 - samples/sec: 2686.53 - lr: 0.000021 - momentum: 0.000000 2023-10-25 12:58:23,572 epoch 4 - iter 616/773 - loss 0.02974028 - time (sec): 36.03 - samples/sec: 2711.79 - lr: 0.000021 - momentum: 0.000000 2023-10-25 12:58:27,965 epoch 4 - iter 693/773 - loss 0.02923596 - time (sec): 40.42 - samples/sec: 2751.85 - lr: 0.000020 - momentum: 0.000000 2023-10-25 12:58:32,489 epoch 4 - iter 770/773 - loss 0.02839604 - time (sec): 44.94 - samples/sec: 2757.19 - lr: 0.000020 - momentum: 0.000000 2023-10-25 12:58:32,667 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:58:32,667 EPOCH 4 done: loss 0.0285 - lr: 0.000020 2023-10-25 12:58:35,305 DEV : loss 0.10123448073863983 - f1-score (micro avg) 0.7588 2023-10-25 12:58:35,322 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:58:40,130 epoch 5 - iter 77/773 - loss 0.01818218 - time (sec): 4.81 - samples/sec: 2724.06 - lr: 0.000020 - momentum: 0.000000 2023-10-25 12:58:44,779 epoch 5 - iter 154/773 - loss 0.01934967 - time (sec): 9.45 - samples/sec: 2642.61 - lr: 0.000019 - momentum: 0.000000 2023-10-25 12:58:49,349 epoch 5 - iter 231/773 - loss 0.01860970 - time (sec): 14.02 - samples/sec: 2622.37 - lr: 0.000019 - momentum: 0.000000 2023-10-25 12:58:53,878 epoch 5 - iter 308/773 - loss 0.01852623 - time 
(sec): 18.55 - samples/sec: 2604.00 - lr: 0.000019 - momentum: 0.000000 2023-10-25 12:58:58,223 epoch 5 - iter 385/773 - loss 0.02029059 - time (sec): 22.90 - samples/sec: 2677.70 - lr: 0.000018 - momentum: 0.000000 2023-10-25 12:59:02,519 epoch 5 - iter 462/773 - loss 0.01982896 - time (sec): 27.19 - samples/sec: 2678.43 - lr: 0.000018 - momentum: 0.000000 2023-10-25 12:59:06,948 epoch 5 - iter 539/773 - loss 0.01903206 - time (sec): 31.62 - samples/sec: 2682.60 - lr: 0.000018 - momentum: 0.000000 2023-10-25 12:59:11,469 epoch 5 - iter 616/773 - loss 0.01983065 - time (sec): 36.14 - samples/sec: 2715.33 - lr: 0.000017 - momentum: 0.000000 2023-10-25 12:59:15,959 epoch 5 - iter 693/773 - loss 0.01906685 - time (sec): 40.63 - samples/sec: 2734.27 - lr: 0.000017 - momentum: 0.000000 2023-10-25 12:59:20,380 epoch 5 - iter 770/773 - loss 0.01928792 - time (sec): 45.06 - samples/sec: 2749.87 - lr: 0.000017 - momentum: 0.000000 2023-10-25 12:59:20,549 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:59:20,549 EPOCH 5 done: loss 0.0193 - lr: 0.000017 2023-10-25 12:59:23,132 DEV : loss 0.09339083731174469 - f1-score (micro avg) 0.7702 2023-10-25 12:59:23,152 saving best model 2023-10-25 12:59:23,843 ---------------------------------------------------------------------------------------------------- 2023-10-25 12:59:28,319 epoch 6 - iter 77/773 - loss 0.01168986 - time (sec): 4.47 - samples/sec: 2758.77 - lr: 0.000016 - momentum: 0.000000 2023-10-25 12:59:32,805 epoch 6 - iter 154/773 - loss 0.01402440 - time (sec): 8.96 - samples/sec: 2807.18 - lr: 0.000016 - momentum: 0.000000 2023-10-25 12:59:37,192 epoch 6 - iter 231/773 - loss 0.01359986 - time (sec): 13.35 - samples/sec: 2773.09 - lr: 0.000016 - momentum: 0.000000 2023-10-25 12:59:41,538 epoch 6 - iter 308/773 - loss 0.01576820 - time (sec): 17.69 - samples/sec: 2806.65 - lr: 0.000015 - momentum: 0.000000 2023-10-25 12:59:45,952 epoch 6 - iter 385/773 - 
loss 0.01553924 - time (sec): 22.11 - samples/sec: 2811.99 - lr: 0.000015 - momentum: 0.000000 2023-10-25 12:59:50,240 epoch 6 - iter 462/773 - loss 0.01476289 - time (sec): 26.40 - samples/sec: 2838.14 - lr: 0.000015 - momentum: 0.000000 2023-10-25 12:59:54,601 epoch 6 - iter 539/773 - loss 0.01422328 - time (sec): 30.76 - samples/sec: 2834.90 - lr: 0.000014 - momentum: 0.000000 2023-10-25 12:59:58,868 epoch 6 - iter 616/773 - loss 0.01396188 - time (sec): 35.02 - samples/sec: 2835.21 - lr: 0.000014 - momentum: 0.000000 2023-10-25 13:00:04,003 epoch 6 - iter 693/773 - loss 0.01434232 - time (sec): 40.16 - samples/sec: 2781.23 - lr: 0.000014 - momentum: 0.000000 2023-10-25 13:00:08,377 epoch 6 - iter 770/773 - loss 0.01390229 - time (sec): 44.53 - samples/sec: 2782.03 - lr: 0.000013 - momentum: 0.000000 2023-10-25 13:00:08,548 ---------------------------------------------------------------------------------------------------- 2023-10-25 13:00:08,548 EPOCH 6 done: loss 0.0139 - lr: 0.000013 2023-10-25 13:00:11,523 DEV : loss 0.10927439481019974 - f1-score (micro avg) 0.7676 2023-10-25 13:00:11,545 ---------------------------------------------------------------------------------------------------- 2023-10-25 13:00:16,317 epoch 7 - iter 77/773 - loss 0.00916644 - time (sec): 4.77 - samples/sec: 2721.74 - lr: 0.000013 - momentum: 0.000000 2023-10-25 13:00:21,088 epoch 7 - iter 154/773 - loss 0.00780367 - time (sec): 9.54 - samples/sec: 2623.17 - lr: 0.000013 - momentum: 0.000000 2023-10-25 13:00:26,000 epoch 7 - iter 231/773 - loss 0.00824754 - time (sec): 14.45 - samples/sec: 2616.04 - lr: 0.000012 - momentum: 0.000000 2023-10-25 13:00:30,653 epoch 7 - iter 308/773 - loss 0.00848303 - time (sec): 19.11 - samples/sec: 2639.15 - lr: 0.000012 - momentum: 0.000000 2023-10-25 13:00:35,271 epoch 7 - iter 385/773 - loss 0.00832052 - time (sec): 23.72 - samples/sec: 2617.13 - lr: 0.000012 - momentum: 0.000000 2023-10-25 13:00:39,832 epoch 7 - iter 462/773 - loss 0.00885479 - 
time (sec): 28.29 - samples/sec: 2634.34 - lr: 0.000011 - momentum: 0.000000 2023-10-25 13:00:44,324 epoch 7 - iter 539/773 - loss 0.00964508 - time (sec): 32.78 - samples/sec: 2650.42 - lr: 0.000011 - momentum: 0.000000 2023-10-25 13:00:49,078 epoch 7 - iter 616/773 - loss 0.00959482 - time (sec): 37.53 - samples/sec: 2638.35 - lr: 0.000011 - momentum: 0.000000 2023-10-25 13:00:53,873 epoch 7 - iter 693/773 - loss 0.00983066 - time (sec): 42.33 - samples/sec: 2657.55 - lr: 0.000010 - momentum: 0.000000 2023-10-25 13:00:58,356 epoch 7 - iter 770/773 - loss 0.00990440 - time (sec): 46.81 - samples/sec: 2648.46 - lr: 0.000010 - momentum: 0.000000 2023-10-25 13:00:58,529 ---------------------------------------------------------------------------------------------------- 2023-10-25 13:00:58,529 EPOCH 7 done: loss 0.0099 - lr: 0.000010 2023-10-25 13:01:01,014 DEV : loss 0.11133752763271332 - f1-score (micro avg) 0.7647 2023-10-25 13:01:01,031 ---------------------------------------------------------------------------------------------------- 2023-10-25 13:01:05,488 epoch 8 - iter 77/773 - loss 0.00796583 - time (sec): 4.46 - samples/sec: 2791.97 - lr: 0.000010 - momentum: 0.000000 2023-10-25 13:01:09,862 epoch 8 - iter 154/773 - loss 0.00803247 - time (sec): 8.83 - samples/sec: 2753.71 - lr: 0.000009 - momentum: 0.000000 2023-10-25 13:01:14,395 epoch 8 - iter 231/773 - loss 0.00591916 - time (sec): 13.36 - samples/sec: 2744.01 - lr: 0.000009 - momentum: 0.000000 2023-10-25 13:01:18,853 epoch 8 - iter 308/773 - loss 0.00604710 - time (sec): 17.82 - samples/sec: 2757.60 - lr: 0.000009 - momentum: 0.000000 2023-10-25 13:01:23,366 epoch 8 - iter 385/773 - loss 0.00748894 - time (sec): 22.33 - samples/sec: 2758.83 - lr: 0.000008 - momentum: 0.000000 2023-10-25 13:01:27,828 epoch 8 - iter 462/773 - loss 0.00792268 - time (sec): 26.80 - samples/sec: 2747.09 - lr: 0.000008 - momentum: 0.000000 2023-10-25 13:01:32,471 epoch 8 - iter 539/773 - loss 0.00729680 - time (sec): 31.44 
- samples/sec: 2787.77 - lr: 0.000008 - momentum: 0.000000 2023-10-25 13:01:37,083 epoch 8 - iter 616/773 - loss 0.00697704 - time (sec): 36.05 - samples/sec: 2774.52 - lr: 0.000007 - momentum: 0.000000 2023-10-25 13:01:41,417 epoch 8 - iter 693/773 - loss 0.00685826 - time (sec): 40.38 - samples/sec: 2764.31 - lr: 0.000007 - momentum: 0.000000 2023-10-25 13:01:45,884 epoch 8 - iter 770/773 - loss 0.00645588 - time (sec): 44.85 - samples/sec: 2759.56 - lr: 0.000007 - momentum: 0.000000 2023-10-25 13:01:46,056 ---------------------------------------------------------------------------------------------------- 2023-10-25 13:01:46,056 EPOCH 8 done: loss 0.0065 - lr: 0.000007 2023-10-25 13:01:48,732 DEV : loss 0.11849800497293472 - f1-score (micro avg) 0.7757 2023-10-25 13:01:48,749 saving best model 2023-10-25 13:01:49,470 ---------------------------------------------------------------------------------------------------- 2023-10-25 13:01:53,926 epoch 9 - iter 77/773 - loss 0.00385767 - time (sec): 4.45 - samples/sec: 2952.48 - lr: 0.000006 - momentum: 0.000000 2023-10-25 13:01:58,418 epoch 9 - iter 154/773 - loss 0.00336774 - time (sec): 8.95 - samples/sec: 2784.84 - lr: 0.000006 - momentum: 0.000000 2023-10-25 13:02:02,980 epoch 9 - iter 231/773 - loss 0.00334359 - time (sec): 13.51 - samples/sec: 2796.57 - lr: 0.000006 - momentum: 0.000000 2023-10-25 13:02:07,317 epoch 9 - iter 308/773 - loss 0.00425217 - time (sec): 17.84 - samples/sec: 2778.14 - lr: 0.000005 - momentum: 0.000000 2023-10-25 13:02:11,731 epoch 9 - iter 385/773 - loss 0.00409554 - time (sec): 22.26 - samples/sec: 2785.82 - lr: 0.000005 - momentum: 0.000000 2023-10-25 13:02:16,194 epoch 9 - iter 462/773 - loss 0.00412034 - time (sec): 26.72 - samples/sec: 2794.35 - lr: 0.000005 - momentum: 0.000000 2023-10-25 13:02:20,762 epoch 9 - iter 539/773 - loss 0.00422574 - time (sec): 31.29 - samples/sec: 2794.67 - lr: 0.000004 - momentum: 0.000000 2023-10-25 13:02:25,322 epoch 9 - iter 616/773 - loss 
0.00422062 - time (sec): 35.85 - samples/sec: 2797.81 - lr: 0.000004 - momentum: 0.000000 2023-10-25 13:02:29,701 epoch 9 - iter 693/773 - loss 0.00418772 - time (sec): 40.23 - samples/sec: 2793.58 - lr: 0.000004 - momentum: 0.000000 2023-10-25 13:02:33,944 epoch 9 - iter 770/773 - loss 0.00459974 - time (sec): 44.47 - samples/sec: 2787.88 - lr: 0.000003 - momentum: 0.000000 2023-10-25 13:02:34,101 ---------------------------------------------------------------------------------------------------- 2023-10-25 13:02:34,101 EPOCH 9 done: loss 0.0046 - lr: 0.000003 2023-10-25 13:02:36,714 DEV : loss 0.12025374174118042 - f1-score (micro avg) 0.7708 2023-10-25 13:02:36,733 ---------------------------------------------------------------------------------------------------- 2023-10-25 13:02:41,339 epoch 10 - iter 77/773 - loss 0.00335894 - time (sec): 4.60 - samples/sec: 2609.85 - lr: 0.000003 - momentum: 0.000000 2023-10-25 13:02:46,249 epoch 10 - iter 154/773 - loss 0.00503572 - time (sec): 9.51 - samples/sec: 2606.07 - lr: 0.000003 - momentum: 0.000000 2023-10-25 13:02:50,911 epoch 10 - iter 231/773 - loss 0.00379034 - time (sec): 14.18 - samples/sec: 2543.26 - lr: 0.000002 - momentum: 0.000000 2023-10-25 13:02:55,569 epoch 10 - iter 308/773 - loss 0.00329299 - time (sec): 18.83 - samples/sec: 2543.49 - lr: 0.000002 - momentum: 0.000000 2023-10-25 13:03:00,225 epoch 10 - iter 385/773 - loss 0.00325492 - time (sec): 23.49 - samples/sec: 2568.46 - lr: 0.000002 - momentum: 0.000000 2023-10-25 13:03:04,754 epoch 10 - iter 462/773 - loss 0.00345791 - time (sec): 28.02 - samples/sec: 2594.25 - lr: 0.000001 - momentum: 0.000000 2023-10-25 13:03:09,480 epoch 10 - iter 539/773 - loss 0.00316405 - time (sec): 32.75 - samples/sec: 2628.26 - lr: 0.000001 - momentum: 0.000000 2023-10-25 13:03:14,068 epoch 10 - iter 616/773 - loss 0.00348066 - time (sec): 37.33 - samples/sec: 2653.42 - lr: 0.000001 - momentum: 0.000000 2023-10-25 13:03:18,559 epoch 10 - iter 693/773 - loss 
0.00321585 - time (sec): 41.83 - samples/sec: 2669.67 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:03:22,933 epoch 10 - iter 770/773 - loss 0.00290477 - time (sec): 46.20 - samples/sec: 2682.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:03:23,088 ----------------------------------------------------------------------------------------------------
2023-10-25 13:03:23,089 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-25 13:03:26,447 DEV : loss 0.12326761335134506 - f1-score (micro avg) 0.7702
2023-10-25 13:03:26,922 ----------------------------------------------------------------------------------------------------
2023-10-25 13:03:26,923 Loading model from best epoch ...
2023-10-25 13:03:28,638 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 13:03:37,408 Results:
- F-score (micro) 0.7792
- F-score (macro) 0.6761
- Accuracy 0.6601

By class:
              precision    recall  f1-score   support

         LOC     0.8471    0.8256    0.8362       946
    BUILDING     0.5414    0.4595    0.4971       185
      STREET     0.6613    0.7321    0.6949        56

   micro avg     0.7949    0.7641    0.7792      1187
   macro avg     0.6833    0.6724    0.6761      1187
weighted avg     0.7907    0.7641    0.7767      1187

2023-10-25 13:03:37,408 ----------------------------------------------------------------------------------------------------