2023-10-17 10:46:58,799 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,800 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:46:58,801 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,801 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 10:46:58,801 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,801 Train: 6183 sentences
2023-10-17 10:46:58,801 (train_with_dev=False, train_with_test=False)
2023-10-17 10:46:58,801 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,801 Training Params:
2023-10-17 10:46:58,801 - learning_rate: "3e-05"
2023-10-17 10:46:58,801 - mini_batch_size: "8"
2023-10-17 10:46:58,801 - max_epochs: "10"
2023-10-17 10:46:58,802 - shuffle: "True"
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,802 Plugins:
2023-10-17 10:46:58,802 - TensorboardLogger
2023-10-17 10:46:58,802 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,802 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:46:58,802 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,802 Computation:
2023-10-17 10:46:58,802 - compute on device: cuda:0
2023-10-17 10:46:58,802 - embedding storage: none
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,802 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
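Given the parameters logged above (learning rate 3e-05, mini-batch size 8, 10 epochs, linear schedule with 10% warmup, best-epoch selection by micro F1), the run can be approximated with Flair's ModelTrainer. The sketch below assumes `tagger` and `corpus` from the previous snippet; the exact Flair version and the way the TensorBoard plugin was attached in the original run are not recorded in this log.

```python
# Training sketch using the hyperparameters listed in this log.
# Assumes `tagger` and `corpus` from the previous sketch.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=3e-5,   # learning_rate: 3e-05
    mini_batch_size=8,    # mini_batch_size: 8
    max_epochs=10,        # max_epochs: 10
    # The "LinearScheduler | warmup_fraction: 0.1" plugin above matches the
    # linear-with-warmup schedule that fine_tune() sets up by default; the
    # TensorboardLogger plugin from the original run is not reproduced here.
)
```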
2023-10-17 10:46:58,803 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,803 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:47:05,789 epoch 1 - iter 77/773 - loss 2.84092854 - time (sec): 6.98 - samples/sec: 1617.95 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:47:13,250 epoch 1 - iter 154/773 - loss 1.53424492 - time (sec): 14.45 - samples/sec: 1717.08 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:47:20,771 epoch 1 - iter 231/773 - loss 1.08862813 - time (sec): 21.97 - samples/sec: 1719.41 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:47:28,354 epoch 1 - iter 308/773 - loss 0.85171098 - time (sec): 29.55 - samples/sec: 1710.60 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:47:35,897 epoch 1 - iter 385/773 - loss 0.70544828 - time (sec): 37.09 - samples/sec: 1707.17 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:47:43,030 epoch 1 - iter 462/773 - loss 0.60370873 - time (sec): 44.23 - samples/sec: 1722.03 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:47:49,959 epoch 1 - iter 539/773 - loss 0.53873054 - time (sec): 51.15 - samples/sec: 1710.83 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:47:56,937 epoch 1 - iter 616/773 - loss 0.48527896 - time (sec): 58.13 - samples/sec: 1716.28 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:48:04,037 epoch 1 - iter 693/773 - loss 0.44261791 - time (sec): 65.23 - samples/sec: 1722.47 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:48:11,055 epoch 1 - iter 770/773 - loss 0.41025787 - time (sec): 72.25 - samples/sec: 1715.14 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:48:11,320 ----------------------------------------------------------------------------------------------------
2023-10-17 10:48:11,320 EPOCH 1 done: loss 0.4094 - lr: 0.000030
2023-10-17 10:48:13,960 DEV : loss 0.05252358317375183 - f1-score (micro avg) 0.7838
2023-10-17 10:48:13,989 saving best model
2023-10-17 10:48:14,524 ----------------------------------------------------------------------------------------------------
2023-10-17 10:48:21,570 epoch 2 - iter 77/773 - loss 0.09652664 - time (sec): 7.04 - samples/sec: 1728.04 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:48:28,751 epoch 2 - iter 154/773 - loss 0.08979408 - time (sec): 14.22 - samples/sec: 1771.30 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:48:35,823 epoch 2 - iter 231/773 - loss 0.08806850 - time (sec): 21.30 - samples/sec: 1734.24 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:48:43,040 epoch 2 - iter 308/773 - loss 0.08255607 - time (sec): 28.51 - samples/sec: 1729.09 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:48:50,429 epoch 2 - iter 385/773 - loss 0.08195398 - time (sec): 35.90 - samples/sec: 1707.30 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:48:57,538 epoch 2 - iter 462/773 - loss 0.07976764 - time (sec): 43.01 - samples/sec: 1704.63 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:49:04,676 epoch 2 - iter 539/773 - loss 0.07710304 - time (sec): 50.15 - samples/sec: 1715.42 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:49:11,854 epoch 2 - iter 616/773 - loss 0.07732638 - time (sec): 57.33 - samples/sec: 1717.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:49:18,955 epoch 2 - iter 693/773 - loss 0.07637746 - time (sec): 64.43 - samples/sec: 1724.33 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:49:26,241 epoch 2 - iter 770/773 - loss 0.07445810 - time (sec): 71.71 - samples/sec: 1728.57 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:49:26,504 ----------------------------------------------------------------------------------------------------
2023-10-17 10:49:26,504 EPOCH 2 done: loss 0.0744 - lr: 0.000027
2023-10-17 10:49:29,437 DEV : loss 0.055021319538354874 - f1-score (micro avg) 0.7724
2023-10-17 10:49:29,470 ----------------------------------------------------------------------------------------------------
2023-10-17 10:49:36,699 epoch 3 - iter 77/773 - loss 0.06692996 - time (sec): 7.23 - samples/sec: 1625.39 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:49:44,491 epoch 3 - iter 154/773 - loss 0.05595855 - time (sec): 15.02 - samples/sec: 1705.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:49:52,092 epoch 3 - iter 231/773 - loss 0.05237529 - time (sec): 22.62 - samples/sec: 1713.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:49:58,924 epoch 3 - iter 308/773 - loss 0.05167815 - time (sec): 29.45 - samples/sec: 1708.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:50:05,851 epoch 3 - iter 385/773 - loss 0.05019417 - time (sec): 36.38 - samples/sec: 1712.12 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:50:13,249 epoch 3 - iter 462/773 - loss 0.05060451 - time (sec): 43.78 - samples/sec: 1713.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:50:20,307 epoch 3 - iter 539/773 - loss 0.04890835 - time (sec): 50.84 - samples/sec: 1713.08 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:50:27,552 epoch 3 - iter 616/773 - loss 0.04926124 - time (sec): 58.08 - samples/sec: 1711.91 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:50:34,925 epoch 3 - iter 693/773 - loss 0.04820747 - time (sec): 65.45 - samples/sec: 1715.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:50:42,182 epoch 3 - iter 770/773 - loss 0.04749550 - time (sec): 72.71 - samples/sec: 1704.16 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:50:42,457 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:42,457 EPOCH 3 done: loss 0.0474 - lr: 0.000023
2023-10-17 10:50:45,693 DEV : loss 0.07840536534786224 - f1-score (micro avg) 0.7919
2023-10-17 10:50:45,730 saving best model
2023-10-17 10:50:47,205 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:54,491 epoch 4 - iter 77/773 - loss 0.03791889 - time (sec): 7.28 - samples/sec: 1781.26 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:51:01,614 epoch 4 - iter 154/773 - loss 0.03523555 - time (sec): 14.41 - samples/sec: 1757.71 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:51:08,817 epoch 4 - iter 231/773 - loss 0.03503291 - time (sec): 21.61 - samples/sec: 1723.30 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:51:16,053 epoch 4 - iter 308/773 - loss 0.03382046 - time (sec): 28.84 - samples/sec: 1708.12 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:51:23,255 epoch 4 - iter 385/773 - loss 0.03384325 - time (sec): 36.05 - samples/sec: 1723.94 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:51:30,499 epoch 4 - iter 462/773 - loss 0.03365221 - time (sec): 43.29 - samples/sec: 1718.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:51:38,090 epoch 4 - iter 539/773 - loss 0.03394297 - time (sec): 50.88 - samples/sec: 1702.50 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:51:45,197 epoch 4 - iter 616/773 - loss 0.03375060 - time (sec): 57.99 - samples/sec: 1717.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:51:52,365 epoch 4 - iter 693/773 - loss 0.03369243 - time (sec): 65.16 - samples/sec: 1718.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:51:59,480 epoch 4 - iter 770/773 - loss 0.03218794 - time (sec): 72.27 - samples/sec: 1713.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:51:59,764 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:59,765 EPOCH 4 done: loss 0.0322 - lr: 0.000020
2023-10-17 10:52:02,651 DEV : loss 0.09961310774087906 - f1-score (micro avg) 0.796
2023-10-17 10:52:02,682 saving best model
2023-10-17 10:52:04,149 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:11,494 epoch 5 - iter 77/773 - loss 0.01433858 - time (sec): 7.34 - samples/sec: 1665.90 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:52:18,610 epoch 5 - iter 154/773 - loss 0.01668492 - time (sec): 14.45 - samples/sec: 1654.98 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:52:25,909 epoch 5 - iter 231/773 - loss 0.01582806 - time (sec): 21.75 - samples/sec: 1643.04 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:52:33,185 epoch 5 - iter 308/773 - loss 0.01747638 - time (sec): 29.03 - samples/sec: 1657.76 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:52:40,748 epoch 5 - iter 385/773 - loss 0.01907711 - time (sec): 36.59 - samples/sec: 1652.85 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:52:48,237 epoch 5 - iter 462/773 - loss 0.01962981 - time (sec): 44.08 - samples/sec: 1653.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:52:55,906 epoch 5 - iter 539/773 - loss 0.02022564 - time (sec): 51.75 - samples/sec: 1674.60 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:53:03,049 epoch 5 - iter 616/773 - loss 0.02094257 - time (sec): 58.89 - samples/sec: 1683.39 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:10,152 epoch 5 - iter 693/773 - loss 0.02132988 - time (sec): 66.00 - samples/sec: 1682.38 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:17,390 epoch 5 - iter 770/773 - loss 0.02276432 - time (sec): 73.23 - samples/sec: 1691.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:17,652 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:17,653 EPOCH 5 done: loss 0.0228 - lr: 0.000017
2023-10-17 10:53:20,568 DEV : loss 0.09981973469257355 - f1-score (micro avg) 0.7714
2023-10-17 10:53:20,598 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:27,937 epoch 6 - iter 77/773 - loss 0.01178514 - time (sec): 7.34 - samples/sec: 1692.35 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:35,153 epoch 6 - iter 154/773 - loss 0.01440421 - time (sec): 14.55 - samples/sec: 1715.19 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:42,304 epoch 6 - iter 231/773 - loss 0.01414618 - time (sec): 21.70 - samples/sec: 1732.08 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:49,943 epoch 6 - iter 308/773 - loss 0.01396628 - time (sec): 29.34 - samples/sec: 1709.14 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:53:57,579 epoch 6 - iter 385/773 - loss 0.01453643 - time (sec): 36.98 - samples/sec: 1681.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:54:06,017 epoch 6 - iter 462/773 - loss 0.01514750 - time (sec): 45.42 - samples/sec: 1641.00 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:54:13,503 epoch 6 - iter 539/773 - loss 0.01444656 - time (sec): 52.90 - samples/sec: 1660.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:54:20,944 epoch 6 - iter 616/773 - loss 0.01500934 - time (sec): 60.34 - samples/sec: 1648.86 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:54:28,692 epoch 6 - iter 693/773 - loss 0.01459877 - time (sec): 68.09 - samples/sec: 1638.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:54:35,865 epoch 6 - iter 770/773 - loss 0.01529271 - time (sec): 75.27 - samples/sec: 1645.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:54:36,121 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:36,121 EPOCH 6 done: loss 0.0152 - lr: 0.000013
2023-10-17 10:54:39,064 DEV : loss 0.10781947523355484 - f1-score (micro avg) 0.7951
2023-10-17 10:54:39,093 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:46,077 epoch 7 - iter 77/773 - loss 0.00893014 - time (sec): 6.98 - samples/sec: 1798.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:54:53,444 epoch 7 - iter 154/773 - loss 0.00871907 - time (sec): 14.35 - samples/sec: 1821.79 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:55:00,489 epoch 7 - iter 231/773 - loss 0.00923813 - time (sec): 21.39 - samples/sec: 1793.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:55:08,020 epoch 7 - iter 308/773 - loss 0.00995155 - time (sec): 28.93 - samples/sec: 1745.50 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:55:14,908 epoch 7 - iter 385/773 - loss 0.01128060 - time (sec): 35.81 - samples/sec: 1749.45 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:55:21,855 epoch 7 - iter 462/773 - loss 0.01127523 - time (sec): 42.76 - samples/sec: 1734.82 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:55:28,920 epoch 7 - iter 539/773 - loss 0.01198194 - time (sec): 49.83 - samples/sec: 1727.98 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:55:35,930 epoch 7 - iter 616/773 - loss 0.01116719 - time (sec): 56.84 - samples/sec: 1740.19 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:55:43,129 epoch 7 - iter 693/773 - loss 0.01065408 - time (sec): 64.03 - samples/sec: 1745.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:55:50,299 epoch 7 - iter 770/773 - loss 0.01059549 - time (sec): 71.20 - samples/sec: 1740.53 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:55:50,572 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:50,572 EPOCH 7 done: loss 0.0106 - lr: 0.000010
2023-10-17 10:55:53,470 DEV : loss 0.11879457533359528 - f1-score (micro avg) 0.8137
2023-10-17 10:55:53,500 saving best model
2023-10-17 10:55:54,945 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:02,442 epoch 8 - iter 77/773 - loss 0.00849530 - time (sec): 7.49 - samples/sec: 1650.42 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:56:09,696 epoch 8 - iter 154/773 - loss 0.00668992 - time (sec): 14.75 - samples/sec: 1715.41 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:56:16,634 epoch 8 - iter 231/773 - loss 0.00632277 - time (sec): 21.68 - samples/sec: 1711.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:56:24,003 epoch 8 - iter 308/773 - loss 0.00710856 - time (sec): 29.05 - samples/sec: 1703.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:56:31,183 epoch 8 - iter 385/773 - loss 0.00694469 - time (sec): 36.23 - samples/sec: 1720.14 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:56:38,232 epoch 8 - iter 462/773 - loss 0.00715802 - time (sec): 43.28 - samples/sec: 1733.62 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:56:45,558 epoch 8 - iter 539/773 - loss 0.00732958 - time (sec): 50.61 - samples/sec: 1721.98 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:56:52,703 epoch 8 - iter 616/773 - loss 0.00761398 - time (sec): 57.75 - samples/sec: 1717.23 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:57:00,365 epoch 8 - iter 693/773 - loss 0.00808954 - time (sec): 65.42 - samples/sec: 1707.19 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:57:07,847 epoch 8 - iter 770/773 - loss 0.00789069 - time (sec): 72.90 - samples/sec: 1697.25 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:57:08,147 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:08,147 EPOCH 8 done: loss 0.0079 - lr: 0.000007
2023-10-17 10:57:11,061 DEV : loss 0.11245165020227432 - f1-score (micro avg) 0.8065
2023-10-17 10:57:11,090 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:17,927 epoch 9 - iter 77/773 - loss 0.00678704 - time (sec): 6.83 - samples/sec: 1792.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:57:25,076 epoch 9 - iter 154/773 - loss 0.00603493 - time (sec): 13.98 - samples/sec: 1728.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:57:32,293 epoch 9 - iter 231/773 - loss 0.00487800 - time (sec): 21.20 - samples/sec: 1781.00 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:57:40,077 epoch 9 - iter 308/773 - loss 0.00497749 - time (sec): 28.98 - samples/sec: 1707.92 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:57:47,281 epoch 9 - iter 385/773 - loss 0.00481061 - time (sec): 36.19 - samples/sec: 1699.34 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:57:54,356 epoch 9 - iter 462/773 - loss 0.00526176 - time (sec): 43.26 - samples/sec: 1696.55 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:58:01,333 epoch 9 - iter 539/773 - loss 0.00508493 - time (sec): 50.24 - samples/sec: 1702.12 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:58:08,962 epoch 9 - iter 616/773 - loss 0.00501247 - time (sec): 57.87 - samples/sec: 1715.29 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:58:15,614 epoch 9 - iter 693/773 - loss 0.00488110 - time (sec): 64.52 - samples/sec: 1722.18 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:58:22,667 epoch 9 - iter 770/773 - loss 0.00486167 - time (sec): 71.58 - samples/sec: 1732.27 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:58:22,949 ----------------------------------------------------------------------------------------------------
2023-10-17 10:58:22,950 EPOCH 9 done: loss 0.0049 - lr: 0.000003
2023-10-17 10:58:26,006 DEV : loss 0.1249605268239975 - f1-score (micro avg) 0.7984
2023-10-17 10:58:26,042 ----------------------------------------------------------------------------------------------------
2023-10-17 10:58:33,332 epoch 10 - iter 77/773 - loss 0.00242796 - time (sec): 7.29 - samples/sec: 1792.63 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:58:40,248 epoch 10 - iter 154/773 - loss 0.00282294 - time (sec): 14.20 - samples/sec: 1729.26 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:58:47,465 epoch 10 - iter 231/773 - loss 0.00368862 - time (sec): 21.42 - samples/sec: 1696.13 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:58:54,604 epoch 10 - iter 308/773 - loss 0.00321279 - time (sec): 28.56 - samples/sec: 1716.83 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:59:01,489 epoch 10 - iter 385/773 - loss 0.00288216 - time (sec): 35.44 - samples/sec: 1725.88 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:59:08,465 epoch 10 - iter 462/773 - loss 0.00268120 - time (sec): 42.42 - samples/sec: 1726.17 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:59:15,643 epoch 10 - iter 539/773 - loss 0.00271947 - time (sec): 49.60 - samples/sec: 1736.60 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:59:22,844 epoch 10 - iter 616/773 - loss 0.00268237 - time (sec): 56.80 - samples/sec: 1764.79 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:59:29,999 epoch 10 - iter 693/773 - loss 0.00287023 - time (sec): 63.95 - samples/sec: 1747.91 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:59:37,487 epoch 10 - iter 770/773 - loss 0.00290057 - time (sec): 71.44 - samples/sec: 1734.34 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:59:37,759 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:37,759 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-17 10:59:40,948 DEV : loss 0.13062351942062378 - f1-score (micro avg) 0.8041
2023-10-17 10:59:41,616 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:41,618 Loading model from best epoch ...
2023-10-17 10:59:44,193 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 10:59:53,111
Results:
- F-score (micro) 0.8174
- F-score (macro) 0.7401
- Accuracy 0.7088
By class:
              precision    recall  f1-score   support

         LOC     0.8804    0.8404    0.8599       946
    BUILDING     0.6604    0.5676    0.6105       185
      STREET     0.7500    0.7500    0.7500        56

   micro avg     0.8426    0.7936    0.8174      1187
   macro avg     0.7636    0.7193    0.7401      1187
weighted avg     0.8400    0.7936    0.8159      1187
2023-10-17 10:59:53,111 ----------------------------------------------------------------------------------------------------
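For completeness, the best checkpoint evaluated above (best-model.pt, 0.8174 micro F1 over the 1187 test mentions) can be loaded for inference with the standard Flair API. A small usage sketch follows, with a made-up example sentence; the checkpoint path is assumed to mirror the training base path.

```python
# Inference sketch: the checkpoint path is assumed to mirror the training base
# path above, and the label type is assumed to be "ner".
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

sentence = Sentence("He lived on Baker Street , near the British Museum .")
tagger.predict(sentence)

# Prints detected LOC / BUILDING / STREET spans with their confidence scores.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, round(span.get_label("ner").score, 4))
```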