2023-10-25 15:40:49,725 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:40:49,726 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
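The module summary above spells out every layer shape, so the model's size can be tallied directly from it. A minimal sketch (pure Python, no dependencies; the helper names are mine, and the count assumes every listed weight, bias, and LayerNorm parameter is trainable):

```python
# Back-of-the-envelope parameter count, using only the shapes printed
# in the SequenceTagger summary above.

def linear_params(n_in, n_out, bias=True):
    """Weight matrix plus optional bias of a Linear(n_in, n_out)."""
    return n_in * n_out + (n_out if bias else 0)

def layer_norm_params(dim):
    """LayerNorm carries a weight and a bias vector."""
    return 2 * dim

H, FF, LAYERS = 768, 3072, 12

embeddings = (64001 * H            # word embeddings (64k subword vocab)
              + 512 * H            # position embeddings
              + 2 * H              # token type embeddings
              + layer_norm_params(H))

per_layer = (3 * linear_params(H, H)   # query / key / value
             + linear_params(H, H)     # attention output dense
             + layer_norm_params(H)
             + linear_params(H, FF)    # intermediate
             + linear_params(FF, H)    # output dense
             + layer_norm_params(H))

pooler = linear_params(H, H)
tagger_head = linear_params(H, 17)     # 17 tags in the label dictionary

total = embeddings + LAYERS * per_layer + pooler + tagger_head
print(f"per encoder layer: {per_layer:,}  total: {total:,}")
```

The 64k vocabulary makes the embedding table (~49.5M parameters) by far the largest single block, roughly a third of the whole model.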
2023-10-25 15:40:49,726 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 Train: 20847 sentences
2023-10-25 15:40:49,726 (train_with_dev=False, train_with_test=False)
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Training Params:
2023-10-25 15:40:49,727 - learning_rate: "5e-05"
2023-10-25 15:40:49,727 - mini_batch_size: "4"
2023-10-25 15:40:49,727 - max_epochs: "10"
2023-10-25 15:40:49,727 - shuffle: "True"
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Plugins:
2023-10-25 15:40:49,727 - TensorboardLogger
2023-10-25 15:40:49,727 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:40:49,727 - metric: "('micro avg', 'f1-score')"
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Computation:
2023-10-25 15:40:49,727 - compute on device: cuda:0
2023-10-25 15:40:49,727 - embedding storage: none
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:41:12,430 epoch 1 - iter 521/5212 - loss 1.24847613 - time (sec): 22.70 - samples/sec: 1631.49 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:41:35,742 epoch 1 - iter 1042/5212 - loss 0.79409542 - time (sec): 46.01 - samples/sec: 1589.33 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:41:58,495 epoch 1 - iter 1563/5212 - loss 0.62000166 - time (sec): 68.77 - samples/sec: 1620.44 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:42:21,224 epoch 1 - iter 2084/5212 - loss 0.52313149 - time (sec): 91.50 - samples/sec: 1620.63 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:42:45,174 epoch 1 - iter 2605/5212 - loss 0.46787427 - time (sec): 115.45 - samples/sec: 1636.39 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:43:07,482 epoch 1 - iter 3126/5212 - loss 0.42636894 - time (sec): 137.75 - samples/sec: 1634.90 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:43:29,865 epoch 1 - iter 3647/5212 - loss 0.39828423 - time (sec): 160.14 - samples/sec: 1640.03 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:43:52,495 epoch 1 - iter 4168/5212 - loss 0.37970166 - time (sec): 182.77 - samples/sec: 1635.63 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:44:15,495 epoch 1 - iter 4689/5212 - loss 0.37361807 - time (sec): 205.77 - samples/sec: 1614.91 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:44:37,828 epoch 1 - iter 5210/5212 - loss 0.36166651 - time (sec): 228.10 - samples/sec: 1610.66 - lr: 0.000050 - momentum: 0.000000
2023-10-25 15:44:37,911 ----------------------------------------------------------------------------------------------------
2023-10-25 15:44:37,911 EPOCH 1 done: loss 0.3617 - lr: 0.000050
2023-10-25 15:44:41,574 DEV : loss 0.22080442309379578 - f1-score (micro avg) 0.1438
2023-10-25 15:44:41,599 saving best model
2023-10-25 15:44:42,080 ----------------------------------------------------------------------------------------------------
2023-10-25 15:45:05,344 epoch 2 - iter 521/5212 - loss 0.27701476 - time (sec): 23.26 - samples/sec: 1639.01 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:45:28,141 epoch 2 - iter 1042/5212 - loss 0.26774174 - time (sec): 46.06 - samples/sec: 1663.32 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:45:50,898 epoch 2 - iter 1563/5212 - loss 0.25315314 - time (sec): 68.82 - samples/sec: 1650.18 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:46:14,163 epoch 2 - iter 2084/5212 - loss 0.29757522 - time (sec): 92.08 - samples/sec: 1628.91 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:46:36,427 epoch 2 - iter 2605/5212 - loss 0.32230692 - time (sec): 114.35 - samples/sec: 1631.59 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:46:58,692 epoch 2 - iter 3126/5212 - loss 0.31370599 - time (sec): 136.61 - samples/sec: 1625.37 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:47:21,123 epoch 2 - iter 3647/5212 - loss 0.30764574 - time (sec): 159.04 - samples/sec: 1631.06 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:47:43,308 epoch 2 - iter 4168/5212 - loss 0.29897371 - time (sec): 181.23 - samples/sec: 1637.77 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:48:05,566 epoch 2 - iter 4689/5212 - loss 0.29460317 - time (sec): 203.48 - samples/sec: 1623.40 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:48:27,906 epoch 2 - iter 5210/5212 - loss 0.28663517 - time (sec): 225.83 - samples/sec: 1626.67 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:48:27,993 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:27,993 EPOCH 2 done: loss 0.2867 - lr: 0.000044
2023-10-25 15:48:35,139 DEV : loss 0.15546594560146332 - f1-score (micro avg) 0.2256
2023-10-25 15:48:35,165 saving best model
2023-10-25 15:48:35,768 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:57,895 epoch 3 - iter 521/5212 - loss 0.30169616 - time (sec): 22.12 - samples/sec: 1661.09 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:49:20,057 epoch 3 - iter 1042/5212 - loss 0.30759449 - time (sec): 44.29 - samples/sec: 1617.71 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:49:42,707 epoch 3 - iter 1563/5212 - loss 0.27595502 - time (sec): 66.94 - samples/sec: 1612.63 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:05,056 epoch 3 - iter 2084/5212 - loss 0.24782471 - time (sec): 89.29 - samples/sec: 1632.91 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:27,628 epoch 3 - iter 2605/5212 - loss 0.23028822 - time (sec): 111.86 - samples/sec: 1639.48 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:48,988 epoch 3 - iter 3126/5212 - loss 0.22726832 - time (sec): 133.22 - samples/sec: 1652.52 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:12,671 epoch 3 - iter 3647/5212 - loss 0.22179543 - time (sec): 156.90 - samples/sec: 1629.53 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:34,997 epoch 3 - iter 4168/5212 - loss 0.21682299 - time (sec): 179.23 - samples/sec: 1633.25 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:51:57,314 epoch 3 - iter 4689/5212 - loss 0.21234182 - time (sec): 201.54 - samples/sec: 1625.65 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:52:19,667 epoch 3 - iter 5210/5212 - loss 0.20687427 - time (sec): 223.90 - samples/sec: 1640.40 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:52:19,759 ----------------------------------------------------------------------------------------------------
2023-10-25 15:52:19,759 EPOCH 3 done: loss 0.2068 - lr: 0.000039
2023-10-25 15:52:26,616 DEV : loss 0.24288956820964813 - f1-score (micro avg) 0.2767
2023-10-25 15:52:26,640 saving best model
2023-10-25 15:52:27,244 ----------------------------------------------------------------------------------------------------
2023-10-25 15:52:49,976 epoch 4 - iter 521/5212 - loss 0.14541617 - time (sec): 22.73 - samples/sec: 1594.73 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:53:12,334 epoch 4 - iter 1042/5212 - loss 0.14059672 - time (sec): 45.09 - samples/sec: 1612.24 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:53:34,746 epoch 4 - iter 1563/5212 - loss 0.14336781 - time (sec): 67.50 - samples/sec: 1619.62 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:53:56,360 epoch 4 - iter 2084/5212 - loss 0.15113922 - time (sec): 89.11 - samples/sec: 1613.43 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:54:18,268 epoch 4 - iter 2605/5212 - loss 0.15623749 - time (sec): 111.02 - samples/sec: 1607.30 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:54:40,720 epoch 4 - iter 3126/5212 - loss 0.15743031 - time (sec): 133.47 - samples/sec: 1639.68 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:55:02,499 epoch 4 - iter 3647/5212 - loss 0.15737676 - time (sec): 155.25 - samples/sec: 1630.29 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:55:24,814 epoch 4 - iter 4168/5212 - loss 0.16195469 - time (sec): 177.57 - samples/sec: 1635.41 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:55:46,893 epoch 4 - iter 4689/5212 - loss 0.16745988 - time (sec): 199.65 - samples/sec: 1638.43 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:56:09,860 epoch 4 - iter 5210/5212 - loss 0.16610419 - time (sec): 222.61 - samples/sec: 1649.17 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:56:09,961 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:09,961 EPOCH 4 done: loss 0.1662 - lr: 0.000033
2023-10-25 15:56:16,861 DEV : loss 0.23563335835933685 - f1-score (micro avg) 0.3339
2023-10-25 15:56:16,886 saving best model
2023-10-25 15:56:17,503 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:40,112 epoch 5 - iter 521/5212 - loss 0.16272851 - time (sec): 22.60 - samples/sec: 1648.41 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:57:02,345 epoch 5 - iter 1042/5212 - loss 0.15512717 - time (sec): 44.84 - samples/sec: 1621.45 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:57:24,778 epoch 5 - iter 1563/5212 - loss 0.15269704 - time (sec): 67.27 - samples/sec: 1649.97 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:57:46,753 epoch 5 - iter 2084/5212 - loss 0.16050403 - time (sec): 89.25 - samples/sec: 1661.90 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:58:08,939 epoch 5 - iter 2605/5212 - loss 0.16890836 - time (sec): 111.43 - samples/sec: 1645.12 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:58:30,542 epoch 5 - iter 3126/5212 - loss 0.16716994 - time (sec): 133.03 - samples/sec: 1650.66 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:58:52,733 epoch 5 - iter 3647/5212 - loss 0.16808642 - time (sec): 155.23 - samples/sec: 1651.60 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:59:14,632 epoch 5 - iter 4168/5212 - loss 0.16952321 - time (sec): 177.12 - samples/sec: 1661.77 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:59:36,952 epoch 5 - iter 4689/5212 - loss 0.17072890 - time (sec): 199.44 - samples/sec: 1670.87 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:59:58,912 epoch 5 - iter 5210/5212 - loss 0.17071298 - time (sec): 221.40 - samples/sec: 1659.36 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:59:58,988 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:58,989 EPOCH 5 done: loss 0.1707 - lr: 0.000028
2023-10-25 16:00:05,971 DEV : loss 0.2225915640592575 - f1-score (micro avg) 0.2691
2023-10-25 16:00:05,997 ----------------------------------------------------------------------------------------------------
2023-10-25 16:00:28,448 epoch 6 - iter 521/5212 - loss 0.14161427 - time (sec): 22.45 - samples/sec: 1756.92 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:00:50,604 epoch 6 - iter 1042/5212 - loss 0.13835517 - time (sec): 44.61 - samples/sec: 1765.39 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:01:12,353 epoch 6 - iter 1563/5212 - loss 0.14662044 - time (sec): 66.35 - samples/sec: 1706.62 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:01:33,931 epoch 6 - iter 2084/5212 - loss 0.16265685 - time (sec): 87.93 - samples/sec: 1700.00 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:01:55,994 epoch 6 - iter 2605/5212 - loss 0.16371044 - time (sec): 110.00 - samples/sec: 1661.81 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:02:17,993 epoch 6 - iter 3126/5212 - loss 0.16618079 - time (sec): 131.99 - samples/sec: 1661.11 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:02:40,063 epoch 6 - iter 3647/5212 - loss 0.16564198 - time (sec): 154.06 - samples/sec: 1666.52 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:03:02,499 epoch 6 - iter 4168/5212 - loss 0.16634156 - time (sec): 176.50 - samples/sec: 1667.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:03:24,979 epoch 6 - iter 4689/5212 - loss 0.16502704 - time (sec): 198.98 - samples/sec: 1667.40 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:03:46,903 epoch 6 - iter 5210/5212 - loss 0.16841155 - time (sec): 220.90 - samples/sec: 1662.80 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:03:46,983 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:46,983 EPOCH 6 done: loss 0.1684 - lr: 0.000022
2023-10-25 16:03:53,850 DEV : loss 0.20808175206184387 - f1-score (micro avg) 0.2423
2023-10-25 16:03:53,875 ----------------------------------------------------------------------------------------------------
2023-10-25 16:04:16,243 epoch 7 - iter 521/5212 - loss 0.16586561 - time (sec): 22.37 - samples/sec: 1558.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:04:38,816 epoch 7 - iter 1042/5212 - loss 0.14976359 - time (sec): 44.94 - samples/sec: 1600.71 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:05:00,525 epoch 7 - iter 1563/5212 - loss 0.14294447 - time (sec): 66.65 - samples/sec: 1584.38 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:05:22,503 epoch 7 - iter 2084/5212 - loss 0.14958504 - time (sec): 88.63 - samples/sec: 1605.49 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:05:44,352 epoch 7 - iter 2605/5212 - loss 0.15272797 - time (sec): 110.48 - samples/sec: 1622.73 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:06:07,447 epoch 7 - iter 3126/5212 - loss 0.15256052 - time (sec): 133.57 - samples/sec: 1647.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:06:29,677 epoch 7 - iter 3647/5212 - loss 0.15252498 - time (sec): 155.80 - samples/sec: 1675.50 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:06:51,711 epoch 7 - iter 4168/5212 - loss 0.15070024 - time (sec): 177.83 - samples/sec: 1676.15 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:07:13,611 epoch 7 - iter 4689/5212 - loss 0.15136587 - time (sec): 199.73 - samples/sec: 1674.35 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:07:35,780 epoch 7 - iter 5210/5212 - loss 0.15018887 - time (sec): 221.90 - samples/sec: 1654.74 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:07:35,873 ----------------------------------------------------------------------------------------------------
2023-10-25 16:07:35,873 EPOCH 7 done: loss 0.1502 - lr: 0.000017
2023-10-25 16:07:42,032 DEV : loss 0.2270050346851349 - f1-score (micro avg) 0.2385
2023-10-25 16:07:42,058 ----------------------------------------------------------------------------------------------------
2023-10-25 16:08:04,521 epoch 8 - iter 521/5212 - loss 0.14589267 - time (sec): 22.46 - samples/sec: 1749.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:08:26,536 epoch 8 - iter 1042/5212 - loss 0.12679147 - time (sec): 44.48 - samples/sec: 1648.21 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:08:49,455 epoch 8 - iter 1563/5212 - loss 0.13027312 - time (sec): 67.40 - samples/sec: 1629.21 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:11,507 epoch 8 - iter 2084/5212 - loss 0.13178230 - time (sec): 89.45 - samples/sec: 1632.23 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:33,603 epoch 8 - iter 2605/5212 - loss 0.13370436 - time (sec): 111.54 - samples/sec: 1649.83 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:55,760 epoch 8 - iter 3126/5212 - loss 0.13098280 - time (sec): 133.70 - samples/sec: 1673.75 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:17,942 epoch 8 - iter 3647/5212 - loss 0.13384808 - time (sec): 155.88 - samples/sec: 1659.85 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:40,208 epoch 8 - iter 4168/5212 - loss 0.13575297 - time (sec): 178.15 - samples/sec: 1686.01 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:11:01,988 epoch 8 - iter 4689/5212 - loss 0.13594078 - time (sec): 199.93 - samples/sec: 1671.82 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:11:23,871 epoch 8 - iter 5210/5212 - loss 0.13594037 - time (sec): 221.81 - samples/sec: 1656.28 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:11:23,947 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:23,947 EPOCH 8 done: loss 0.1359 - lr: 0.000011
2023-10-25 16:11:30,260 DEV : loss 0.2554771304130554 - f1-score (micro avg) 0.2288
2023-10-25 16:11:30,286 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:52,419 epoch 9 - iter 521/5212 - loss 0.11567060 - time (sec): 22.13 - samples/sec: 1548.27 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:12:14,779 epoch 9 - iter 1042/5212 - loss 0.12925166 - time (sec): 44.49 - samples/sec: 1633.21 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:12:36,602 epoch 9 - iter 1563/5212 - loss 0.12394185 - time (sec): 66.31 - samples/sec: 1646.06 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:12:58,308 epoch 9 - iter 2084/5212 - loss 0.12623309 - time (sec): 88.02 - samples/sec: 1642.85 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:13:21,115 epoch 9 - iter 2605/5212 - loss 0.12686418 - time (sec): 110.83 - samples/sec: 1621.52 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:13:43,274 epoch 9 - iter 3126/5212 - loss 0.12617219 - time (sec): 132.99 - samples/sec: 1632.06 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:14:06,024 epoch 9 - iter 3647/5212 - loss 0.12470505 - time (sec): 155.74 - samples/sec: 1630.26 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:14:28,476 epoch 9 - iter 4168/5212 - loss 0.12338268 - time (sec): 178.19 - samples/sec: 1647.55 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:14:50,431 epoch 9 - iter 4689/5212 - loss 0.12303248 - time (sec): 200.14 - samples/sec: 1655.20 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:15:13,282 epoch 9 - iter 5210/5212 - loss 0.12380050 - time (sec): 222.99 - samples/sec: 1647.36 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:15:13,380 ----------------------------------------------------------------------------------------------------
2023-10-25 16:15:13,380 EPOCH 9 done: loss 0.1238 - lr: 0.000006
2023-10-25 16:15:20,024 DEV : loss 0.25904580950737 - f1-score (micro avg) 0.2442
2023-10-25 16:15:20,051 ----------------------------------------------------------------------------------------------------
2023-10-25 16:15:42,601 epoch 10 - iter 521/5212 - loss 0.09950056 - time (sec): 22.55 - samples/sec: 1569.28 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:16:05,408 epoch 10 - iter 1042/5212 - loss 0.09962004 - time (sec): 45.36 - samples/sec: 1621.04 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:16:28,632 epoch 10 - iter 1563/5212 - loss 0.10130806 - time (sec): 68.58 - samples/sec: 1566.23 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:16:50,806 epoch 10 - iter 2084/5212 - loss 0.10676368 - time (sec): 90.75 - samples/sec: 1600.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:17:12,875 epoch 10 - iter 2605/5212 - loss 0.11022598 - time (sec): 112.82 - samples/sec: 1629.67 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:17:35,096 epoch 10 - iter 3126/5212 - loss 0.10822074 - time (sec): 135.04 - samples/sec: 1640.39 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:17:57,698 epoch 10 - iter 3647/5212 - loss 0.11044716 - time (sec): 157.65 - samples/sec: 1638.54 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:18:20,171 epoch 10 - iter 4168/5212 - loss 0.11138827 - time (sec): 180.12 - samples/sec: 1634.56 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:18:42,613 epoch 10 - iter 4689/5212 - loss 0.11209184 - time (sec): 202.56 - samples/sec: 1624.26 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:19:04,757 epoch 10 - iter 5210/5212 - loss 0.11119727 - time (sec): 224.70 - samples/sec: 1634.63 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:19:04,836 ----------------------------------------------------------------------------------------------------
2023-10-25 16:19:04,836 EPOCH 10 done: loss 0.1112 - lr: 0.000000
2023-10-25 16:19:11,706 DEV : loss 0.2723826766014099 - f1-score (micro avg) 0.232
2023-10-25 16:19:12,230 ----------------------------------------------------------------------------------------------------
2023-10-25 16:19:12,231 Loading model from best epoch ...
2023-10-25 16:19:13,972 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 16:19:23,755 Results:
- F-score (micro) 0.3408
- F-score (macro) 0.2134
- Accuracy 0.2086

By class:
              precision    recall  f1-score   support

         LOC     0.4309    0.4876    0.4575      1214
         PER     0.2575    0.2970    0.2759       808
         ORG     0.1042    0.1416    0.1200       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3166    0.3690    0.3408      2390
   macro avg     0.1981    0.2316    0.2134      2390
weighted avg     0.3213    0.3690    0.3434      2390

2023-10-25 16:19:23,755 ----------------------------------------------------------------------------------------------------
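The final evaluation table can be cross-checked arithmetically: each f1-score is the harmonic mean of its precision and recall, and the macro row is the unweighted mean over the four classes. A small sketch (pure Python; the numbers are copied from the table above, so per-class f1 recomputed from the rounded precision/recall can drift in the fourth decimal, e.g. for PER):

```python
# Recompute the f1-scores and macro averages from the printed
# precision/recall values of the final test report.

def f1(precision, recall):
    """Harmonic mean of precision and recall; 0 if both are 0."""
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs from the "By class" table
by_class = {
    "LOC": (0.4309, 0.4876),
    "PER": (0.2575, 0.2970),
    "ORG": (0.1042, 0.1416),
    "HumanProd": (0.0000, 0.0000),
}

for name, (p, r) in by_class.items():
    print(f"{name:>10}: f1 = {f1(p, r):.4f}")

# micro avg pools TP/FP/FN over all classes; from its printed
# precision/recall the harmonic mean reproduces the reported F-score
micro_f1 = f1(0.3166, 0.3690)          # ≈ 0.3408

# macro avg is the plain mean of the per-class columns
macro_p = sum(p for p, _ in by_class.values()) / len(by_class)
macro_r = sum(r for _, r in by_class.values()) / len(by_class)
print(f"micro f1 = {micro_f1:.4f}, macro p/r = {macro_p:.4f}/{macro_r:.4f}")
```

The same check explains why HumanProd (support 15) drags the macro F-score (0.2134) far below the micro F-score (0.3408), which is dominated by the frequent LOC and PER classes.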