2023-10-17 10:46:58,799 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,800 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:46:58,801 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,801 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 10:46:58,801 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,801 Train: 6183 sentences
2023-10-17 10:46:58,801 (train_with_dev=False, train_with_test=False)
2023-10-17 10:46:58,801 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,801 Training Params:
2023-10-17 10:46:58,801  - learning_rate: "3e-05"
2023-10-17 10:46:58,801  - mini_batch_size: "8"
2023-10-17 10:46:58,801  - max_epochs: "10"
2023-10-17 10:46:58,802  - shuffle: "True"
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,802 Plugins:
2023-10-17 10:46:58,802  - TensorboardLogger
2023-10-17 10:46:58,802  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,802 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:46:58,802  - metric: "('micro avg', 'f1-score')"
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,802 Computation:
2023-10-17 10:46:58,802  - compute on device: cuda:0
2023-10-17 10:46:58,802  - embedding storage: none
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,802 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 10:46:58,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,803 ----------------------------------------------------------------------------------------------------
2023-10-17 10:46:58,803 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:47:05,789 epoch 1 - iter 77/773 - loss 2.84092854 - time (sec): 6.98 - samples/sec: 1617.95 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:47:13,250 epoch 1 - iter 154/773 - loss 1.53424492 - time (sec): 14.45 - samples/sec: 1717.08 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:47:20,771 epoch 1 - iter 231/773 - loss 1.08862813 - time (sec): 21.97 - samples/sec: 1719.41 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:47:28,354 epoch 1 - iter 308/773 - loss 0.85171098 - time (sec): 29.55 - samples/sec: 1710.60 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:47:35,897 epoch 1 - iter 385/773 - loss 0.70544828 - time (sec): 37.09 - samples/sec: 1707.17 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:47:43,030 epoch 1 - iter 462/773 - loss 0.60370873 - time (sec): 44.23 - samples/sec: 1722.03 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:47:49,959 epoch 1 - iter 539/773 - loss 0.53873054 - time (sec): 51.15 - samples/sec: 1710.83 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:47:56,937 epoch 1 - iter 616/773 - loss 0.48527896 - time (sec): 58.13 - samples/sec: 1716.28 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:48:04,037 epoch 1 - iter 693/773 - loss 0.44261791 - time (sec): 65.23 - samples/sec: 1722.47 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:48:11,055 epoch 1 - iter 770/773 - loss 0.41025787 - time (sec): 72.25 - samples/sec: 1715.14 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:48:11,320 ----------------------------------------------------------------------------------------------------
2023-10-17 10:48:11,320 EPOCH 1 done: loss 0.4094 - lr: 0.000030
2023-10-17 10:48:13,960 DEV : loss 0.05252358317375183 - f1-score (micro avg) 0.7838
2023-10-17 10:48:13,989 saving best model
2023-10-17 10:48:14,524 ----------------------------------------------------------------------------------------------------
2023-10-17 10:48:21,570 epoch 2 - iter 77/773 - loss 0.09652664 - time (sec): 7.04 - samples/sec: 1728.04 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:48:28,751 epoch 2 - iter 154/773 - loss 0.08979408 - time (sec): 14.22 - samples/sec: 1771.30 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:48:35,823 epoch 2 - iter 231/773 - loss 0.08806850 - time (sec): 21.30 - samples/sec: 1734.24 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:48:43,040 epoch 2 - iter 308/773 - loss 0.08255607 - time (sec): 28.51 - samples/sec: 1729.09 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:48:50,429 epoch 2 - iter 385/773 - loss 0.08195398 - time (sec): 35.90 - samples/sec: 1707.30 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:48:57,538 epoch 2 - iter 462/773 - loss 0.07976764 - time (sec): 43.01 - samples/sec: 1704.63 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:49:04,676 epoch 2 - iter 539/773 - loss 0.07710304 - time (sec): 50.15 - samples/sec: 1715.42 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:49:11,854 epoch 2 - iter 616/773 - loss 0.07732638 - time (sec): 57.33 - samples/sec: 1717.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:49:18,955 epoch 2 - iter 693/773 - loss 0.07637746 - time (sec): 64.43 - samples/sec: 1724.33 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:49:26,241 epoch 2 - iter 770/773 - loss 0.07445810 - time (sec): 71.71 - samples/sec: 1728.57 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:49:26,504 ----------------------------------------------------------------------------------------------------
2023-10-17 10:49:26,504 EPOCH 2 done: loss 0.0744 - lr: 0.000027
2023-10-17 10:49:29,437 DEV : loss 0.055021319538354874 - f1-score (micro avg) 0.7724
2023-10-17 10:49:29,470 ----------------------------------------------------------------------------------------------------
2023-10-17 10:49:36,699 epoch 3 - iter 77/773 - loss 0.06692996 - time (sec): 7.23 - samples/sec: 1625.39 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:49:44,491 epoch 3 - iter 154/773 - loss 0.05595855 - time (sec): 15.02 - samples/sec: 1705.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:49:52,092 epoch 3 - iter 231/773 - loss 0.05237529 - time (sec): 22.62 - samples/sec: 1713.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:49:58,924 epoch 3 - iter 308/773 - loss 0.05167815 - time (sec): 29.45 - samples/sec: 1708.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:50:05,851 epoch 3 - iter 385/773 - loss 0.05019417 - time (sec): 36.38 - samples/sec: 1712.12 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:50:13,249 epoch 3 - iter 462/773 - loss 0.05060451 - time (sec): 43.78 - samples/sec: 1713.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:50:20,307 epoch 3 - iter 539/773 - loss 0.04890835 - time (sec): 50.84 - samples/sec: 1713.08 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:50:27,552 epoch 3 - iter 616/773 - loss 0.04926124 - time (sec): 58.08 - samples/sec: 1711.91 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:50:34,925 epoch 3 - iter 693/773 - loss 0.04820747 - time (sec): 65.45 - samples/sec: 1715.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:50:42,182 epoch 3 - iter 770/773 - loss 0.04749550 - time (sec): 72.71 - samples/sec: 1704.16 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:50:42,457 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:42,457 EPOCH 3 done: loss 0.0474 - lr: 0.000023
2023-10-17 10:50:45,693 DEV : loss 0.07840536534786224 - f1-score (micro avg) 0.7919
2023-10-17 10:50:45,730 saving best model
2023-10-17 10:50:47,205 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:54,491 epoch 4 - iter 77/773 - loss 0.03791889 - time (sec): 7.28 - samples/sec: 1781.26 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:51:01,614 epoch 4 - iter 154/773 - loss 0.03523555 - time (sec): 14.41 - samples/sec: 1757.71 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:51:08,817 epoch 4 - iter 231/773 - loss 0.03503291 - time (sec): 21.61 - samples/sec: 1723.30 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:51:16,053 epoch 4 - iter 308/773 - loss 0.03382046 - time (sec): 28.84 - samples/sec: 1708.12 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:51:23,255 epoch 4 - iter 385/773 - loss 0.03384325 - time (sec): 36.05 - samples/sec: 1723.94 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:51:30,499 epoch 4 - iter 462/773 - loss 0.03365221 - time (sec): 43.29 - samples/sec: 1718.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:51:38,090 epoch 4 - iter 539/773 - loss 0.03394297 - time (sec): 50.88 - samples/sec: 1702.50 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:51:45,197 epoch 4 - iter 616/773 - loss 0.03375060 - time (sec): 57.99 - samples/sec: 1717.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:51:52,365 epoch 4 - iter 693/773 - loss 0.03369243 - time (sec): 65.16 - samples/sec: 1718.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:51:59,480 epoch 4 - iter 770/773 - loss 0.03218794 - time (sec): 72.27 - samples/sec: 1713.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:51:59,764 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:59,765 EPOCH 4 done: loss 0.0322 - lr: 0.000020
2023-10-17 10:52:02,651 DEV : loss 0.09961310774087906 - f1-score (micro avg) 0.796
2023-10-17 10:52:02,682 saving best model
2023-10-17 10:52:04,149 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:11,494 epoch 5 - iter 77/773 - loss 0.01433858 - time (sec): 7.34 - samples/sec: 1665.90 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:52:18,610 epoch 5 - iter 154/773 - loss 0.01668492 - time (sec): 14.45 - samples/sec: 1654.98 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:52:25,909 epoch 5 - iter 231/773 - loss 0.01582806 - time (sec): 21.75 - samples/sec: 1643.04 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:52:33,185 epoch 5 - iter 308/773 - loss 0.01747638 - time (sec): 29.03 - samples/sec: 1657.76 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:52:40,748 epoch 5 - iter 385/773 - loss 0.01907711 - time (sec): 36.59 - samples/sec: 1652.85 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:52:48,237 epoch 5 - iter 462/773 - loss 0.01962981 - time (sec): 44.08 - samples/sec: 1653.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:52:55,906 epoch 5 - iter 539/773 - loss 0.02022564 - time (sec): 51.75 - samples/sec: 1674.60 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:53:03,049 epoch 5 - iter 616/773 - loss 0.02094257 - time (sec): 58.89 - samples/sec: 1683.39 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:10,152 epoch 5 - iter 693/773 - loss 0.02132988 - time (sec): 66.00 - samples/sec: 1682.38 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:17,390 epoch 5 - iter 770/773 - loss 0.02276432 - time (sec): 73.23 - samples/sec: 1691.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:17,652 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:17,653 EPOCH 5 done: loss 0.0228 - lr: 0.000017
2023-10-17 10:53:20,568 DEV : loss 0.09981973469257355 - f1-score (micro avg) 0.7714
2023-10-17 10:53:20,598 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:27,937 epoch 6 - iter 77/773 - loss 0.01178514 - time (sec): 7.34 - samples/sec: 1692.35 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:35,153 epoch 6 - iter 154/773 - loss 0.01440421 - time (sec): 14.55 - samples/sec: 1715.19 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:42,304 epoch 6 - iter 231/773 - loss 0.01414618 - time (sec): 21.70 - samples/sec: 1732.08 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:49,943 epoch 6 - iter 308/773 - loss 0.01396628 - time (sec): 29.34 - samples/sec: 1709.14 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:53:57,579 epoch 6 - iter 385/773 - loss 0.01453643 - time (sec): 36.98 - samples/sec: 1681.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:54:06,017 epoch 6 - iter 462/773 - loss 0.01514750 - time (sec): 45.42 - samples/sec: 1641.00 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:54:13,503 epoch 6 - iter 539/773 - loss 0.01444656 - time (sec): 52.90 - samples/sec: 1660.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:54:20,944 epoch 6 - iter 616/773 - loss 0.01500934 - time (sec): 60.34 - samples/sec: 1648.86 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:54:28,692 epoch 6 - iter 693/773 - loss 0.01459877 - time (sec): 68.09 - samples/sec: 1638.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:54:35,865 epoch 6 - iter 770/773 - loss 0.01529271 - time (sec): 75.27 - samples/sec: 1645.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:54:36,121 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:36,121 EPOCH 6 done: loss 0.0152 - lr: 0.000013
2023-10-17 10:54:39,064 DEV : loss 0.10781947523355484 - f1-score (micro avg) 0.7951
2023-10-17 10:54:39,093 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:46,077 epoch 7 - iter 77/773 - loss 0.00893014 - time (sec): 6.98 - samples/sec: 1798.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:54:53,444 epoch 7 - iter 154/773 - loss 0.00871907 - time (sec): 14.35 - samples/sec: 1821.79 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:55:00,489 epoch 7 - iter 231/773 - loss 0.00923813 - time (sec): 21.39 - samples/sec: 1793.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:55:08,020 epoch 7 - iter 308/773 - loss 0.00995155 - time (sec): 28.93 - samples/sec: 1745.50 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:55:14,908 epoch 7 - iter 385/773 - loss 0.01128060 - time (sec): 35.81 - samples/sec: 1749.45 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:55:21,855 epoch 7 - iter 462/773 - loss 0.01127523 - time (sec): 42.76 - samples/sec: 1734.82 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:55:28,920 epoch 7 - iter 539/773 - loss 0.01198194 - time (sec): 49.83 - samples/sec: 1727.98 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:55:35,930 epoch 7 - iter 616/773 - loss 0.01116719 - time (sec): 56.84 - samples/sec: 1740.19 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:55:43,129 epoch 7 - iter 693/773 - loss 0.01065408 - time (sec): 64.03 - samples/sec: 1745.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:55:50,299 epoch 7 - iter 770/773 - loss 0.01059549 - time (sec): 71.20 - samples/sec: 1740.53 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:55:50,572 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:50,572 EPOCH 7 done: loss 0.0106 - lr: 0.000010
2023-10-17 10:55:53,470 DEV : loss 0.11879457533359528 - f1-score (micro avg) 0.8137
2023-10-17 10:55:53,500 saving best model
2023-10-17 10:55:54,945 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:02,442 epoch 8 - iter 77/773 - loss 0.00849530 - time (sec): 7.49 - samples/sec: 1650.42 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:56:09,696 epoch 8 - iter 154/773 - loss 0.00668992 - time (sec): 14.75 - samples/sec: 1715.41 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:56:16,634 epoch 8 - iter 231/773 - loss 0.00632277 - time (sec): 21.68 - samples/sec: 1711.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:56:24,003 epoch 8 - iter 308/773 - loss 0.00710856 - time (sec): 29.05 - samples/sec: 1703.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:56:31,183 epoch 8 - iter 385/773 - loss 0.00694469 - time (sec): 36.23 - samples/sec: 1720.14 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:56:38,232 epoch 8 - iter 462/773 - loss 0.00715802 - time (sec): 43.28 - samples/sec: 1733.62 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:56:45,558 epoch 8 - iter 539/773 - loss 0.00732958 - time (sec): 50.61 - samples/sec: 1721.98 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:56:52,703 epoch 8 - iter 616/773 - loss 0.00761398 - time (sec): 57.75 - samples/sec: 1717.23 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:57:00,365 epoch 8 - iter 693/773 - loss 0.00808954 - time (sec): 65.42 - samples/sec: 1707.19 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:57:07,847 epoch 8 - iter 770/773 - loss 0.00789069 - time (sec): 72.90 - samples/sec: 1697.25 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:57:08,147 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:08,147 EPOCH 8 done: loss 0.0079 - lr: 0.000007
2023-10-17 10:57:11,061 DEV : loss 0.11245165020227432 - f1-score (micro avg) 0.8065
2023-10-17 10:57:11,090 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:17,927 epoch 9 - iter 77/773 - loss 0.00678704 - time (sec): 6.83 - samples/sec: 1792.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:57:25,076 epoch 9 - iter 154/773 - loss 0.00603493 - time (sec): 13.98 - samples/sec: 1728.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:57:32,293 epoch 9 - iter 231/773 - loss 0.00487800 - time (sec): 21.20 - samples/sec: 1781.00 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:57:40,077 epoch 9 - iter 308/773 - loss 0.00497749 - time (sec): 28.98 - samples/sec: 1707.92 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:57:47,281 epoch 9 - iter 385/773 - loss 0.00481061 - time (sec): 36.19 - samples/sec: 1699.34 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:57:54,356 epoch 9 - iter 462/773 - loss 0.00526176 - time (sec): 43.26 - samples/sec: 1696.55 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:58:01,333 epoch 9 - iter 539/773 - loss 0.00508493 - time (sec): 50.24 - samples/sec: 1702.12 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:58:08,962 epoch 9 - iter 616/773 - loss 0.00501247 - time (sec): 57.87 - samples/sec: 1715.29 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:58:15,614 epoch 9 - iter 693/773 - loss 0.00488110 - time (sec): 64.52 - samples/sec: 1722.18 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:58:22,667 epoch 9 - iter 770/773 - loss 0.00486167 - time (sec): 71.58 - samples/sec: 1732.27 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:58:22,949 ----------------------------------------------------------------------------------------------------
2023-10-17 10:58:22,950 EPOCH 9 done: loss 0.0049 - lr: 0.000003
2023-10-17 10:58:26,006 DEV : loss 0.1249605268239975 - f1-score (micro avg) 0.7984
2023-10-17 10:58:26,042 ----------------------------------------------------------------------------------------------------
2023-10-17 10:58:33,332 epoch 10 - iter 77/773 - loss 0.00242796 - time (sec): 7.29 - samples/sec: 1792.63 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:58:40,248 epoch 10 - iter 154/773 - loss 0.00282294 - time (sec): 14.20 - samples/sec: 1729.26 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:58:47,465 epoch 10 - iter 231/773 - loss 0.00368862 - time (sec): 21.42 - samples/sec: 1696.13 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:58:54,604 epoch 10 - iter 308/773 - loss 0.00321279 - time (sec): 28.56 - samples/sec: 1716.83 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:59:01,489 epoch 10 - iter 385/773 - loss 0.00288216 - time (sec): 35.44 - samples/sec: 1725.88 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:59:08,465 epoch 10 - iter 462/773 - loss 0.00268120 - time (sec): 42.42 - samples/sec: 1726.17 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:59:15,643 epoch 10 - iter 539/773 - loss 0.00271947 - time (sec): 49.60 - samples/sec: 1736.60 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:59:22,844 epoch 10 - iter 616/773 - loss 0.00268237 - time (sec): 56.80 - samples/sec: 1764.79 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:59:29,999 epoch 10 - iter 693/773 - loss 0.00287023 - time (sec): 63.95 - samples/sec: 1747.91 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:59:37,487 epoch 10 - iter 770/773 - loss 0.00290057 - time (sec): 71.44 - samples/sec: 1734.34 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:59:37,759 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:37,759 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-17 10:59:40,948 DEV : loss 0.13062351942062378 - f1-score (micro avg) 0.8041
2023-10-17 10:59:41,616 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:41,618 Loading model from best epoch ...
2023-10-17 10:59:44,193 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 10:59:53,111 Results:
- F-score (micro) 0.8174
- F-score (macro) 0.7401
- Accuracy 0.7088

By class:
              precision    recall  f1-score   support

         LOC     0.8804    0.8404    0.8599       946
    BUILDING     0.6604    0.5676    0.6105       185
      STREET     0.7500    0.7500    0.7500        56

   micro avg     0.8426    0.7936    0.8174      1187
   macro avg     0.7636    0.7193    0.7401      1187
weighted avg     0.8400    0.7936    0.8159      1187

2023-10-17 10:59:53,111 ----------------------------------------------------------------------------------------------------
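For reference, the sketch below shows one way the best-model.pt checkpoint referenced above could be loaded for inference with Flair. It is illustrative only: the checkpoint path is assembled from the "Model training base path" in this log, while the label type "ner" and the example sentence are assumptions rather than values recorded here.

```python
# Minimal inference sketch, assuming the `flair` package is installed and that
# best-model.pt exists under the training base path shown in this log.
from flair.data import Sentence
from flair.models import SequenceTagger

# Path assembled from the logged "Model training base path" plus the
# best-model.pt checkpoint referenced in the final-evaluation settings.
tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

# Hypothetical example sentence; the tagger predicts the 13 BIOES tags listed
# above (LOC, BUILDING and STREET spans).
sentence = Sentence("He lived on Fleet Street near St Paul's Cathedral in London.")
tagger.predict(sentence)

# "ner" is assumed to be the label type used during fine-tuning.
for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(f"{span.text}\t{label.value}\t{label.score:.2f}")
```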