2023-10-25 10:07:45,243 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,244 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:07:45,244 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,244 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Train: 6183 sentences
2023-10-25 10:07:45,245 (train_with_dev=False, train_with_test=False)
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Training Params:
2023-10-25 10:07:45,245 - learning_rate: "3e-05"
2023-10-25 10:07:45,245 - mini_batch_size: "4"
2023-10-25 10:07:45,245 - max_epochs: "10"
2023-10-25 10:07:45,245 - shuffle: "True"
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Plugins:
2023-10-25 10:07:45,245 - TensorboardLogger
2023-10-25 10:07:45,245 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:07:45,245 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Computation:
2023-10-25 10:07:45,245 - compute on device: cuda:0
2023-10-25 10:07:45,245 - embedding storage: none
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:45,245 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:07:54,382 epoch 1 - iter 154/1546 - loss 1.95898349 - time (sec): 9.14 - samples/sec: 1382.04 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:08:03,947 epoch 1 - iter 308/1546 - loss 1.08649823 - time (sec): 18.70 - samples/sec: 1356.53 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:08:13,323 epoch 1 - iter 462/1546 - loss 0.77749102 - time (sec): 28.08 - samples/sec: 1339.78 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:08:22,401 epoch 1 - iter 616/1546 - loss 0.61557554 - time (sec): 37.15 - samples/sec: 1350.74 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:08:31,651 epoch 1 - iter 770/1546 - loss 0.52609909 - time (sec): 46.40 - samples/sec: 1327.24 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:08:41,081 epoch 1 - iter 924/1546 - loss 0.45827477 - time (sec): 55.83 - samples/sec: 1323.47 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:08:50,887 epoch 1 - iter 1078/1546 - loss 0.40834105 - time (sec): 65.64 - samples/sec: 1316.31 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:09:00,366 epoch 1 - iter 1232/1546 - loss 0.37098150 - time (sec): 75.12 - samples/sec: 1322.36 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:09:09,449 epoch 1 - iter 1386/1546 - loss 0.34265115 - time (sec): 84.20 - samples/sec: 1322.78 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:09:18,649 epoch 1 - iter 1540/1546 - loss 0.31711290 - time (sec): 93.40 - samples/sec: 1327.83 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:09:18,971 ----------------------------------------------------------------------------------------------------
2023-10-25 10:09:18,972 EPOCH 1 done: loss 0.3168 - lr: 0.000030
2023-10-25 10:09:23,096 DEV : loss 0.06953319162130356 - f1-score (micro avg) 0.728
2023-10-25 10:09:23,121 saving best model
2023-10-25 10:09:23,684 ----------------------------------------------------------------------------------------------------
2023-10-25 10:09:32,952 epoch 2 - iter 154/1546 - loss 0.08565728 - time (sec): 9.27 - samples/sec: 1332.61 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:09:41,220 epoch 2 - iter 308/1546 - loss 0.07829584 - time (sec): 17.53 - samples/sec: 1394.51 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:09:49,136 epoch 2 - iter 462/1546 - loss 0.08041971 - time (sec): 25.45 - samples/sec: 1449.04 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:09:57,343 epoch 2 - iter 616/1546 - loss 0.07988986 - time (sec): 33.66 - samples/sec: 1467.37 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:10:05,884 epoch 2 - iter 770/1546 - loss 0.07995253 - time (sec): 42.20 - samples/sec: 1457.42 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:10:14,384 epoch 2 - iter 924/1546 - loss 0.07950843 - time (sec): 50.70 - samples/sec: 1456.83 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:10:23,311 epoch 2 - iter 1078/1546 - loss 0.07844762 - time (sec): 59.62 - samples/sec: 1450.06 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:10:31,932 epoch 2 - iter 1232/1546 - loss 0.07929517 - time (sec): 68.25 - samples/sec: 1451.56 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:10:40,728 epoch 2 - iter 1386/1546 - loss 0.07977055 - time (sec): 77.04 - samples/sec: 1450.59 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:10:49,454 epoch 2 - iter 1540/1546 - loss 0.08051213 - time (sec): 85.77 - samples/sec: 1444.81 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:10:49,779 ----------------------------------------------------------------------------------------------------
2023-10-25 10:10:49,779 EPOCH 2 done: loss 0.0805 - lr: 0.000027
2023-10-25 10:10:52,439 DEV : loss 0.06576813757419586 - f1-score (micro avg) 0.7718
2023-10-25 10:10:52,455 saving best model
2023-10-25 10:10:53,216 ----------------------------------------------------------------------------------------------------
2023-10-25 10:11:01,862 epoch 3 - iter 154/1546 - loss 0.04411862 - time (sec): 8.64 - samples/sec: 1443.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:11:10,682 epoch 3 - iter 308/1546 - loss 0.04134666 - time (sec): 17.46 - samples/sec: 1398.66 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:11:19,067 epoch 3 - iter 462/1546 - loss 0.04200707 - time (sec): 25.85 - samples/sec: 1413.11 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:11:28,242 epoch 3 - iter 616/1546 - loss 0.04883651 - time (sec): 35.02 - samples/sec: 1394.56 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:11:37,518 epoch 3 - iter 770/1546 - loss 0.04944480 - time (sec): 44.30 - samples/sec: 1380.19 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:11:46,001 epoch 3 - iter 924/1546 - loss 0.04932366 - time (sec): 52.78 - samples/sec: 1403.01 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:11:54,294 epoch 3 - iter 1078/1546 - loss 0.05053819 - time (sec): 61.08 - samples/sec: 1417.76 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:12:02,472 epoch 3 - iter 1232/1546 - loss 0.05116452 - time (sec): 69.25 - samples/sec: 1428.39 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:12:10,945 epoch 3 - iter 1386/1546 - loss 0.05124381 - time (sec): 77.73 - samples/sec: 1429.55 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:12:19,189 epoch 3 - iter 1540/1546 - loss 0.05213069 - time (sec): 85.97 - samples/sec: 1438.07 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:12:19,499 ----------------------------------------------------------------------------------------------------
2023-10-25 10:12:19,499 EPOCH 3 done: loss 0.0521 - lr: 0.000023
2023-10-25 10:12:22,512 DEV : loss 0.08034052699804306 - f1-score (micro avg) 0.7676
2023-10-25 10:12:22,534 ----------------------------------------------------------------------------------------------------
2023-10-25 10:12:31,143 epoch 4 - iter 154/1546 - loss 0.03047819 - time (sec): 8.61 - samples/sec: 1494.64 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:12:39,417 epoch 4 - iter 308/1546 - loss 0.03210798 - time (sec): 16.88 - samples/sec: 1449.02 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:12:47,693 epoch 4 - iter 462/1546 - loss 0.03589036 - time (sec): 25.16 - samples/sec: 1492.13 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:12:56,033 epoch 4 - iter 616/1546 - loss 0.03542909 - time (sec): 33.50 - samples/sec: 1506.75 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:13:04,396 epoch 4 - iter 770/1546 - loss 0.03538510 - time (sec): 41.86 - samples/sec: 1487.02 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:13:12,795 epoch 4 - iter 924/1546 - loss 0.03504173 - time (sec): 50.26 - samples/sec: 1489.25 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:13:21,058 epoch 4 - iter 1078/1546 - loss 0.03479597 - time (sec): 58.52 - samples/sec: 1501.68 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:13:29,439 epoch 4 - iter 1232/1546 - loss 0.03494097 - time (sec): 66.90 - samples/sec: 1495.78 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:13:37,640 epoch 4 - iter 1386/1546 - loss 0.03530495 - time (sec): 75.10 - samples/sec: 1491.99 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:13:45,873 epoch 4 - iter 1540/1546 - loss 0.03579614 - time (sec): 83.34 - samples/sec: 1485.33 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:13:46,164 ----------------------------------------------------------------------------------------------------
2023-10-25 10:13:46,164 EPOCH 4 done: loss 0.0358 - lr: 0.000020
2023-10-25 10:13:49,325 DEV : loss 0.10077176988124847 - f1-score (micro avg) 0.7794
2023-10-25 10:13:49,343 saving best model
2023-10-25 10:13:50,397 ----------------------------------------------------------------------------------------------------
2023-10-25 10:13:58,433 epoch 5 - iter 154/1546 - loss 0.02159984 - time (sec): 8.03 - samples/sec: 1417.90 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:14:06,479 epoch 5 - iter 308/1546 - loss 0.01984275 - time (sec): 16.08 - samples/sec: 1531.20 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:14:14,358 epoch 5 - iter 462/1546 - loss 0.01943297 - time (sec): 23.96 - samples/sec: 1561.52 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:14:22,405 epoch 5 - iter 616/1546 - loss 0.02182829 - time (sec): 32.01 - samples/sec: 1557.29 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:14:30,348 epoch 5 - iter 770/1546 - loss 0.02422414 - time (sec): 39.95 - samples/sec: 1552.65 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:14:38,271 epoch 5 - iter 924/1546 - loss 0.02304780 - time (sec): 47.87 - samples/sec: 1558.65 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:14:46,226 epoch 5 - iter 1078/1546 - loss 0.02401278 - time (sec): 55.83 - samples/sec: 1554.19 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:14:54,088 epoch 5 - iter 1232/1546 - loss 0.02393514 - time (sec): 63.69 - samples/sec: 1556.48 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:15:02,055 epoch 5 - iter 1386/1546 - loss 0.02427300 - time (sec): 71.66 - samples/sec: 1557.22 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:15:09,948 epoch 5 - iter 1540/1546 - loss 0.02413449 - time (sec): 79.55 - samples/sec: 1557.93 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:15:10,234 ----------------------------------------------------------------------------------------------------
2023-10-25 10:15:10,234 EPOCH 5 done: loss 0.0241 - lr: 0.000017
2023-10-25 10:15:13,016 DEV : loss 0.10452549159526825 - f1-score (micro avg) 0.7705
2023-10-25 10:15:13,036 ----------------------------------------------------------------------------------------------------
2023-10-25 10:15:21,266 epoch 6 - iter 154/1546 - loss 0.01168533 - time (sec): 8.23 - samples/sec: 1528.49 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:15:29,267 epoch 6 - iter 308/1546 - loss 0.01065419 - time (sec): 16.23 - samples/sec: 1537.60 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:15:37,298 epoch 6 - iter 462/1546 - loss 0.01526262 - time (sec): 24.26 - samples/sec: 1541.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:15:45,424 epoch 6 - iter 616/1546 - loss 0.01612627 - time (sec): 32.39 - samples/sec: 1542.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:15:53,337 epoch 6 - iter 770/1546 - loss 0.01582392 - time (sec): 40.30 - samples/sec: 1513.88 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:16:01,223 epoch 6 - iter 924/1546 - loss 0.01644036 - time (sec): 48.19 - samples/sec: 1517.02 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:16:09,384 epoch 6 - iter 1078/1546 - loss 0.01548608 - time (sec): 56.35 - samples/sec: 1520.60 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:16:17,233 epoch 6 - iter 1232/1546 - loss 0.01600009 - time (sec): 64.20 - samples/sec: 1541.94 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:16:25,302 epoch 6 - iter 1386/1546 - loss 0.01604283 - time (sec): 72.26 - samples/sec: 1542.64 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:16:33,560 epoch 6 - iter 1540/1546 - loss 0.01611172 - time (sec): 80.52 - samples/sec: 1538.36 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:16:33,865 ----------------------------------------------------------------------------------------------------
2023-10-25 10:16:33,865 EPOCH 6 done: loss 0.0161 - lr: 0.000013
2023-10-25 10:16:37,133 DEV : loss 0.10936635732650757 - f1-score (micro avg) 0.7838
2023-10-25 10:16:37,151 saving best model
2023-10-25 10:16:37,850 ----------------------------------------------------------------------------------------------------
2023-10-25 10:16:46,325 epoch 7 - iter 154/1546 - loss 0.01008689 - time (sec): 8.47 - samples/sec: 1438.72 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:16:54,564 epoch 7 - iter 308/1546 - loss 0.01216195 - time (sec): 16.71 - samples/sec: 1492.79 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:17:02,857 epoch 7 - iter 462/1546 - loss 0.00994628 - time (sec): 25.00 - samples/sec: 1525.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:17:11,081 epoch 7 - iter 616/1546 - loss 0.01019350 - time (sec): 33.23 - samples/sec: 1513.71 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:17:19,490 epoch 7 - iter 770/1546 - loss 0.01005783 - time (sec): 41.64 - samples/sec: 1499.81 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:17:27,757 epoch 7 - iter 924/1546 - loss 0.01009497 - time (sec): 49.90 - samples/sec: 1482.68 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:17:36,242 epoch 7 - iter 1078/1546 - loss 0.01001395 - time (sec): 58.39 - samples/sec: 1488.54 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:17:44,675 epoch 7 - iter 1232/1546 - loss 0.00998899 - time (sec): 66.82 - samples/sec: 1488.34 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:17:53,200 epoch 7 - iter 1386/1546 - loss 0.01038485 - time (sec): 75.35 - samples/sec: 1479.83 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:18:01,461 epoch 7 - iter 1540/1546 - loss 0.01017453 - time (sec): 83.61 - samples/sec: 1478.58 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:18:01,791 ----------------------------------------------------------------------------------------------------
2023-10-25 10:18:01,791 EPOCH 7 done: loss 0.0102 - lr: 0.000010
2023-10-25 10:18:05,007 DEV : loss 0.1250709891319275 - f1-score (micro avg) 0.7574
2023-10-25 10:18:05,026 ----------------------------------------------------------------------------------------------------
2023-10-25 10:18:13,181 epoch 8 - iter 154/1546 - loss 0.00665759 - time (sec): 8.15 - samples/sec: 1538.14 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:18:21,404 epoch 8 - iter 308/1546 - loss 0.00574229 - time (sec): 16.38 - samples/sec: 1574.33 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:18:29,281 epoch 8 - iter 462/1546 - loss 0.00845782 - time (sec): 24.25 - samples/sec: 1557.05 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:18:37,411 epoch 8 - iter 616/1546 - loss 0.00893510 - time (sec): 32.38 - samples/sec: 1529.51 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:18:45,240 epoch 8 - iter 770/1546 - loss 0.00897181 - time (sec): 40.21 - samples/sec: 1523.81 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:18:53,140 epoch 8 - iter 924/1546 - loss 0.00912196 - time (sec): 48.11 - samples/sec: 1520.34 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:19:01,009 epoch 8 - iter 1078/1546 - loss 0.00899284 - time (sec): 55.98 - samples/sec: 1517.08 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:19:09,092 epoch 8 - iter 1232/1546 - loss 0.00792079 - time (sec): 64.06 - samples/sec: 1537.54 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:19:17,137 epoch 8 - iter 1386/1546 - loss 0.00751409 - time (sec): 72.11 - samples/sec: 1545.03 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:19:25,122 epoch 8 - iter 1540/1546 - loss 0.00765103 - time (sec): 80.09 - samples/sec: 1545.22 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:19:25,437 ----------------------------------------------------------------------------------------------------
2023-10-25 10:19:25,438 EPOCH 8 done: loss 0.0076 - lr: 0.000007
2023-10-25 10:19:28,416 DEV : loss 0.138199582695961 - f1-score (micro avg) 0.7826
2023-10-25 10:19:28,435 ----------------------------------------------------------------------------------------------------
2023-10-25 10:19:36,386 epoch 9 - iter 154/1546 - loss 0.00388127 - time (sec): 7.95 - samples/sec: 1454.52 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:19:44,184 epoch 9 - iter 308/1546 - loss 0.00445294 - time (sec): 15.75 - samples/sec: 1513.85 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:19:52,139 epoch 9 - iter 462/1546 - loss 0.00384866 - time (sec): 23.70 - samples/sec: 1546.23 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:19:59,987 epoch 9 - iter 616/1546 - loss 0.00454109 - time (sec): 31.55 - samples/sec: 1565.36 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:20:08,351 epoch 9 - iter 770/1546 - loss 0.00384940 - time (sec): 39.91 - samples/sec: 1567.95 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:20:16,436 epoch 9 - iter 924/1546 - loss 0.00331364 - time (sec): 48.00 - samples/sec: 1565.11 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:20:24,623 epoch 9 - iter 1078/1546 - loss 0.00371152 - time (sec): 56.19 - samples/sec: 1570.81 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:20:32,735 epoch 9 - iter 1232/1546 - loss 0.00348174 - time (sec): 64.30 - samples/sec: 1561.45 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:20:40,421 epoch 9 - iter 1386/1546 - loss 0.00368389 - time (sec): 71.98 - samples/sec: 1552.80 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:20:48,214 epoch 9 - iter 1540/1546 - loss 0.00399989 - time (sec): 79.78 - samples/sec: 1551.76 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:20:48,507 ----------------------------------------------------------------------------------------------------
2023-10-25 10:20:48,507 EPOCH 9 done: loss 0.0040 - lr: 0.000003
2023-10-25 10:20:51,401 DEV : loss 0.14553460478782654 - f1-score (micro avg) 0.7775
2023-10-25 10:20:51,418 ----------------------------------------------------------------------------------------------------
2023-10-25 10:20:59,196 epoch 10 - iter 154/1546 - loss 0.00354279 - time (sec): 7.78 - samples/sec: 1585.87 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:21:07,094 epoch 10 - iter 308/1546 - loss 0.00332450 - time (sec): 15.67 - samples/sec: 1497.57 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:21:15,194 epoch 10 - iter 462/1546 - loss 0.00258446 - time (sec): 23.77 - samples/sec: 1505.04 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:21:23,161 epoch 10 - iter 616/1546 - loss 0.00309299 - time (sec): 31.74 - samples/sec: 1515.18 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:21:31,077 epoch 10 - iter 770/1546 - loss 0.00275583 - time (sec): 39.66 - samples/sec: 1538.77 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:21:39,083 epoch 10 - iter 924/1546 - loss 0.00295097 - time (sec): 47.66 - samples/sec: 1538.93 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:21:46,922 epoch 10 - iter 1078/1546 - loss 0.00274496 - time (sec): 55.50 - samples/sec: 1541.17 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:21:54,953 epoch 10 - iter 1232/1546 - loss 0.00293898 - time (sec): 63.53 - samples/sec: 1546.02 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:22:02,893 epoch 10 - iter 1386/1546 - loss 0.00260959 - time (sec): 71.47 - samples/sec: 1549.52 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:22:10,803 epoch 10 - iter 1540/1546 - loss 0.00281452 - time (sec): 79.38 - samples/sec: 1559.37 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:22:11,140 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:11,141 EPOCH 10 done: loss 0.0028 - lr: 0.000000
2023-10-25 10:22:14,123 DEV : loss 0.14905457198619843 - f1-score (micro avg) 0.7683
2023-10-25 10:22:14,660 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:14,662 Loading model from best epoch ...
2023-10-25 10:22:16,789 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 10:22:26,927 Results:
- F-score (micro) 0.7887
- F-score (macro) 0.6962
- Accuracy 0.6725

By class:
              precision    recall  f1-score   support

         LOC     0.8172    0.8552    0.8357       946
    BUILDING     0.5736    0.6108    0.5916       185
      STREET     0.6029    0.7321    0.6613        56

   micro avg     0.7673    0.8113    0.7887      1187
   macro avg     0.6646    0.7327    0.6962      1187
weighted avg     0.7691    0.8113    0.7895      1187

2023-10-25 10:22:26,928 ----------------------------------------------------------------------------------------------------