2023-10-17 09:16:01,879 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,881 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:16:01,881 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,881 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 09:16:01,881 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,882 Train: 6183 sentences
2023-10-17 09:16:01,882 (train_with_dev=False, train_with_test=False)
2023-10-17 09:16:01,882 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,882 Training Params:
2023-10-17 09:16:01,882 - learning_rate: "5e-05"
2023-10-17 09:16:01,882 - mini_batch_size: "4"
2023-10-17 09:16:01,882 - max_epochs: "10"
2023-10-17 09:16:01,882 - shuffle: "True"
2023-10-17 09:16:01,882 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,882 Plugins:
2023-10-17 09:16:01,882 - TensorboardLogger
2023-10-17 09:16:01,882 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:16:01,882 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,882 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:16:01,882 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:16:01,883 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,883 Computation:
2023-10-17 09:16:01,883 - compute on device: cuda:0
2023-10-17 09:16:01,883 - embedding storage: none
2023-10-17 09:16:01,883 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,883 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 09:16:01,883 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,883 ----------------------------------------------------------------------------------------------------
2023-10-17 09:16:01,883 Logging anything other than scalars to
TensorBoard is currently not supported.
2023-10-17 09:16:13,984 epoch 1 - iter 154/1546 - loss 1.62924967 - time (sec): 12.10 - samples/sec: 1062.47 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:16:25,906 epoch 1 - iter 308/1546 - loss 0.93248566 - time (sec): 24.02 - samples/sec: 1044.33 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:16:37,922 epoch 1 - iter 462/1546 - loss 0.66723251 - time (sec): 36.04 - samples/sec: 1038.33 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:16:49,947 epoch 1 - iter 616/1546 - loss 0.52961124 - time (sec): 48.06 - samples/sec: 1048.97 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:17:01,946 epoch 1 - iter 770/1546 - loss 0.44851569 - time (sec): 60.06 - samples/sec: 1042.97 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:17:14,007 epoch 1 - iter 924/1546 - loss 0.39387565 - time (sec): 72.12 - samples/sec: 1038.88 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:17:26,317 epoch 1 - iter 1078/1546 - loss 0.35997284 - time (sec): 84.43 - samples/sec: 1027.07 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:17:39,074 epoch 1 - iter 1232/1546 - loss 0.33539440 - time (sec): 97.19 - samples/sec: 1016.65 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:17:51,486 epoch 1 - iter 1386/1546 - loss 0.30892754 - time (sec): 109.60 - samples/sec: 1018.04 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:18:03,504 epoch 1 - iter 1540/1546 - loss 0.28846921 - time (sec): 121.62 - samples/sec: 1019.55 - lr: 0.000050 - momentum: 0.000000
2023-10-17 09:18:03,959 ----------------------------------------------------------------------------------------------------
2023-10-17 09:18:03,959 EPOCH 1 done: loss 0.2880 - lr: 0.000050
2023-10-17 09:18:06,548 DEV : loss 0.062463369220495224 - f1-score (micro avg) 0.7284
2023-10-17 09:18:06,576 saving best model
2023-10-17 09:18:07,127 ----------------------------------------------------------------------------------------------------
2023-10-17 09:18:19,061 epoch 2 - iter 154/1546 - loss 0.11824926 - time (sec): 11.93 - samples/sec: 990.45 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:18:31,040 epoch 2 - iter 308/1546 - loss 0.09859653 - time (sec): 23.91 - samples/sec: 1010.34 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:18:43,141 epoch 2 - iter 462/1546 - loss 0.09381658 - time (sec): 36.01 - samples/sec: 1046.89 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:18:55,259 epoch 2 - iter 616/1546 - loss 0.09522061 - time (sec): 48.13 - samples/sec: 1041.58 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:19:07,386 epoch 2 - iter 770/1546 - loss 0.09398797 - time (sec): 60.26 - samples/sec: 1041.37 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:19:19,779 epoch 2 - iter 924/1546 - loss 0.09339793 - time (sec): 72.65 - samples/sec: 1031.34 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:19:32,797 epoch 2 - iter 1078/1546 - loss 0.09259153 - time (sec): 85.67 - samples/sec: 1020.93 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:19:44,967 epoch 2 - iter 1232/1546 - loss 0.09239772 - time (sec): 97.84 - samples/sec: 1027.57 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:19:57,504 epoch 2 - iter 1386/1546 - loss 0.09117852 - time (sec): 110.37 - samples/sec: 1016.56 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:20:10,053 epoch 2 - iter 1540/1546 - loss 0.09196043 - time (sec): 122.92 - samples/sec: 1008.94 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:20:10,529 ----------------------------------------------------------------------------------------------------
2023-10-17 09:20:10,529 EPOCH 2 done: loss 0.0921 - lr: 0.000044
2023-10-17 09:20:13,490 DEV : loss 0.07139772176742554 - f1-score (micro avg) 0.7
2023-10-17 09:20:13,524 ----------------------------------------------------------------------------------------------------
2023-10-17 09:20:25,161 epoch 3 - iter 154/1546 - loss 0.05594334 - time (sec): 11.63 - samples/sec: 1004.75 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:20:37,872 epoch 3 - iter 308/1546 - loss 0.06152451 - time (sec): 24.35 - samples/sec: 1019.88 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:20:50,458 epoch 3 - iter 462/1546 - loss 0.05868671 - time (sec): 36.93 - samples/sec: 1033.95 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:21:02,558 epoch 3 - iter 616/1546 - loss 0.05626465 - time (sec): 49.03 - samples/sec: 1031.87 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:21:14,660 epoch 3 - iter 770/1546 - loss 0.05831441 - time (sec): 61.13 - samples/sec: 1021.45 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:21:26,729 epoch 3 - iter 924/1546 - loss 0.05999687 - time (sec): 73.20 - samples/sec: 1025.56 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:21:38,836 epoch 3 - iter 1078/1546 - loss 0.06078623 - time (sec): 85.31 - samples/sec: 1021.69 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:21:51,064 epoch 3 - iter 1232/1546 - loss 0.06071517 - time (sec): 97.54 - samples/sec: 1020.41 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:22:03,166 epoch 3 - iter 1386/1546 - loss 0.06341036 - time (sec): 109.64 - samples/sec: 1006.31 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:22:15,420 epoch 3 - iter 1540/1546 - loss 0.06258378 - time (sec): 121.89 - samples/sec: 1016.44 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:22:15,882 ----------------------------------------------------------------------------------------------------
2023-10-17 09:22:15,882 EPOCH 3 done: loss 0.0627 - lr: 0.000039
2023-10-17 09:22:18,763 DEV : loss 0.06976697593927383 - f1-score (micro avg) 0.7767
2023-10-17 09:22:18,792 saving best model
2023-10-17 09:22:20,230 ----------------------------------------------------------------------------------------------------
2023-10-17 09:22:32,554 epoch 4 - iter 154/1546 - loss 0.04523365 - time (sec): 12.32 - samples/sec: 1043.60 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:22:45,142 epoch 4 - iter 308/1546 - loss 0.03926806 - time (sec): 24.91 - samples/sec: 984.68 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:22:57,852 epoch 4 - iter 462/1546 - loss 0.04050801 - time (sec): 37.62 - samples/sec: 997.23 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:23:10,024 epoch 4 - iter 616/1546 - loss 0.03878561 - time (sec): 49.79 - samples/sec: 1004.29 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:23:22,388 epoch 4 - iter 770/1546 - loss 0.03865470 - time (sec): 62.15 - samples/sec: 1004.24 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:23:35,333 epoch 4 - iter 924/1546 - loss 0.04069375 - time (sec): 75.10 - samples/sec: 1002.81 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:23:47,453 epoch 4 - iter 1078/1546 - loss 0.04126939 - time (sec): 87.22 - samples/sec: 1009.12 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:24:00,928 epoch 4 - iter 1232/1546 - loss 0.04052158 - time (sec): 100.69 - samples/sec: 990.16 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:24:13,374 epoch 4 - iter 1386/1546 - loss 0.04157212 - time (sec): 113.14 - samples/sec: 985.57 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:24:25,720 epoch 4 - iter 1540/1546 - loss 0.04184639 - time (sec): 125.49 - samples/sec: 987.77 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:24:26,182 ----------------------------------------------------------------------------------------------------
2023-10-17 09:24:26,182 EPOCH 4 done: loss 0.0420 - lr: 0.000033
2023-10-17 09:24:29,002 DEV : loss 0.08950287848711014 - f1-score (micro avg) 0.7621
2023-10-17 09:24:29,033 ----------------------------------------------------------------------------------------------------
2023-10-17 09:24:40,952 epoch 5 - iter 154/1546 - loss 0.03009039 - time (sec): 11.92 - samples/sec: 995.54 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:24:52,812 epoch 5 - iter 308/1546 - loss 0.02873435 - time (sec): 23.78 - samples/sec: 1021.53 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:25:04,892 epoch 5 - iter 462/1546 - loss 0.02553030 - time (sec): 35.86 - samples/sec: 1008.65 - lr: 0.000032 - momentum: 0.000000
2023-10-17
09:25:16,887 epoch 5 - iter 616/1546 - loss 0.02718396 - time (sec): 47.85 - samples/sec: 1011.63 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:25:28,726 epoch 5 - iter 770/1546 - loss 0.03075938 - time (sec): 59.69 - samples/sec: 1028.39 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:25:40,497 epoch 5 - iter 924/1546 - loss 0.02980085 - time (sec): 71.46 - samples/sec: 1035.53 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:25:52,316 epoch 5 - iter 1078/1546 - loss 0.02958354 - time (sec): 83.28 - samples/sec: 1038.68 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:26:04,188 epoch 5 - iter 1232/1546 - loss 0.03071438 - time (sec): 95.15 - samples/sec: 1036.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:26:16,141 epoch 5 - iter 1386/1546 - loss 0.03065998 - time (sec): 107.11 - samples/sec: 1044.83 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:26:27,952 epoch 5 - iter 1540/1546 - loss 0.03163804 - time (sec): 118.92 - samples/sec: 1040.63 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:26:28,420 ----------------------------------------------------------------------------------------------------
2023-10-17 09:26:28,421 EPOCH 5 done: loss 0.0318 - lr: 0.000028
2023-10-17 09:26:31,249 DEV : loss 0.09248381108045578 - f1-score (micro avg) 0.7826
2023-10-17 09:26:31,276 saving best model
2023-10-17 09:26:32,670 ----------------------------------------------------------------------------------------------------
2023-10-17 09:26:44,458 epoch 6 - iter 154/1546 - loss 0.01744772 - time (sec): 11.78 - samples/sec: 1088.41 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:26:56,154 epoch 6 - iter 308/1546 - loss 0.01475739 - time (sec): 23.48 - samples/sec: 1096.15 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:27:07,914 epoch 6 - iter 462/1546 - loss 0.01645344 - time (sec): 35.24 - samples/sec: 1079.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:27:20,358 epoch 6 - iter 616/1546 - loss 0.01806831 - time (sec): 47.68 - samples/sec: 1060.34 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:27:32,424 epoch 6 - iter 770/1546 - loss 0.01828414 - time (sec): 59.75 - samples/sec: 1063.31 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:27:44,726 epoch 6 - iter 924/1546 - loss 0.01867124 - time (sec): 72.05 - samples/sec: 1042.82 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:27:57,290 epoch 6 - iter 1078/1546 - loss 0.02286106 - time (sec): 84.62 - samples/sec: 1027.03 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:28:09,420 epoch 6 - iter 1232/1546 - loss 0.02306730 - time (sec): 96.75 - samples/sec: 1020.36 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:28:21,749 epoch 6 - iter 1386/1546 - loss 0.02425800 - time (sec): 109.08 - samples/sec: 1020.97 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:28:34,208 epoch 6 - iter 1540/1546 - loss 0.02438160 - time (sec): 121.53 - samples/sec: 1019.32 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:28:34,701 ----------------------------------------------------------------------------------------------------
2023-10-17 09:28:34,701 EPOCH 6 done: loss 0.0243 - lr: 0.000022
2023-10-17 09:28:37,708 DEV : loss 0.11331269890069962 - f1-score (micro avg) 0.7824
2023-10-17 09:28:37,738 ----------------------------------------------------------------------------------------------------
2023-10-17 09:28:49,752 epoch 7 - iter 154/1546 - loss 0.00853625 - time (sec): 12.01 - samples/sec: 975.35 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:29:02,437 epoch 7 - iter 308/1546 - loss 0.01700246 - time (sec): 24.70 - samples/sec: 961.02 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:29:14,092 epoch 7 - iter 462/1546 - loss 0.01549663 - time (sec): 36.35 - samples/sec: 996.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:29:25,932 epoch 7 - iter 616/1546 - loss 0.01582650 - time (sec): 48.19 - samples/sec: 1015.55 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:29:37,848 epoch 7 - iter 770/1546 - loss 0.01646521 - time (sec): 60.11 - samples/sec: 1024.53 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:29:49,378 epoch 7 - iter 924/1546 - loss 0.01629401 - time (sec): 71.64 - samples/sec: 1031.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:30:00,738 epoch 7 - iter 1078/1546 - loss 0.01679182 - time (sec): 83.00 - samples/sec: 1035.90 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:30:13,015 epoch 7 - iter 1232/1546 - loss 0.01625462 - time (sec): 95.27 - samples/sec: 1039.96 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:30:24,972 epoch 7 - iter 1386/1546 - loss 0.01691603 - time (sec): 107.23 - samples/sec: 1043.25 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:30:36,848 epoch 7 - iter 1540/1546 - loss 0.01710851 - time (sec): 119.11 - samples/sec: 1038.46 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:30:37,317 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:37,317 EPOCH 7 done: loss 0.0171 - lr: 0.000017
2023-10-17 09:30:40,138 DEV : loss 0.10292882472276688 - f1-score (micro avg) 0.7967
2023-10-17 09:30:40,166 saving best model
2023-10-17 09:30:41,561 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:53,628 epoch 8 - iter 154/1546 - loss 0.00897999 - time (sec): 12.06 - samples/sec: 1026.13 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:31:06,246 epoch 8 - iter 308/1546 - loss 0.00679301 - time (sec): 24.68 - samples/sec: 1023.58 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:31:18,239 epoch 8 - iter 462/1546 - loss 0.00873332 - time (sec): 36.67 - samples/sec: 1018.98 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:31:30,487 epoch 8 - iter 616/1546 - loss 0.00762723 - time (sec): 48.92 - samples/sec: 1011.60 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:31:42,740 epoch 8 - iter 770/1546 - loss 0.00715001 - time (sec): 61.17 - samples/sec: 1005.54 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:31:54,758 epoch 8 - iter 924/1546 - loss 0.00780939 - time (sec): 73.19 - samples/sec: 1020.03 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:32:06,777 epoch 8 - iter 1078/1546 - loss 0.00748800 - time (sec): 85.21 - samples/sec: 1028.26 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:32:18,870 epoch 8 - iter 1232/1546 - loss 0.00779496 - time (sec): 97.31 - samples/sec: 1022.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:32:30,904 epoch 8 - iter 1386/1546 - loss 0.00808887 - time (sec): 109.34 - samples/sec: 1014.93 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:32:42,904 epoch 8 - iter 1540/1546 - loss 0.00813670 - time (sec): 121.34 - samples/sec: 1021.43 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:32:43,372 ----------------------------------------------------------------------------------------------------
2023-10-17 09:32:43,372 EPOCH 8 done: loss 0.0081 - lr: 0.000011
2023-10-17 09:32:46,130 DEV : loss 0.14049740135669708 - f1-score (micro avg) 0.767
2023-10-17 09:32:46,160 ----------------------------------------------------------------------------------------------------
2023-10-17 09:32:58,264 epoch 9 - iter 154/1546 - loss 0.00309967 - time (sec): 12.10 - samples/sec: 1041.71 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:33:10,247 epoch 9 - iter 308/1546 - loss 0.00466253 - time (sec): 24.08 - samples/sec: 1018.41 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:33:22,003 epoch 9 - iter 462/1546 - loss 0.00679552 - time (sec): 35.84 - samples/sec: 1043.56 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:33:33,721 epoch 9 - iter 616/1546 - loss 0.00701922 - time (sec): 47.56 - samples/sec: 1033.03 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:33:45,568 epoch 9 - iter 770/1546 - loss 0.00656941 - time (sec): 59.41 - samples/sec: 1041.79 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:33:58,097 epoch 9 - iter 924/1546 - loss 0.00655801 - time (sec): 71.94 - samples/sec: 1028.82 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:34:10,150 epoch 9 - iter 1078/1546 - loss 0.00604590 - time (sec): 83.99 - samples/sec: 1034.54 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:34:22,061 epoch 9 - iter 1232/1546 - loss 0.00650962 - time (sec): 95.90 - samples/sec: 1032.50 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:34:34,171 epoch 9 - iter 1386/1546 - loss 0.00605757 - time (sec): 108.01 - samples/sec: 1039.92 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:34:46,499 epoch 9 - iter 1540/1546 - loss 0.00602743 - time (sec): 120.34 - samples/sec: 1028.93 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:34:46,985 ----------------------------------------------------------------------------------------------------
2023-10-17 09:34:46,985 EPOCH 9 done: loss 0.0060 - lr: 0.000006
2023-10-17 09:34:49,926 DEV : loss 0.12698261439800262 - f1-score (micro avg) 0.8032
2023-10-17 09:34:49,956 saving best model
2023-10-17 09:34:51,408 ----------------------------------------------------------------------------------------------------
2023-10-17 09:35:04,024 epoch 10 - iter 154/1546 - loss 0.00737061 - time (sec): 12.61 - samples/sec: 992.85 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:35:16,144 epoch 10 - iter 308/1546 - loss 0.00570858 - time (sec): 24.73 - samples/sec: 1002.41 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:35:28,174 epoch 10 - iter 462/1546 - loss 0.00405368 - time (sec): 36.76 - samples/sec: 1029.47 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:35:39,730 epoch 10 - iter 616/1546 - loss 0.00351886 - time (sec): 48.32 - samples/sec: 1040.18 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:35:51,440 epoch 10 - iter 770/1546 - loss 0.00343175 - time (sec): 60.03 - samples/sec: 1041.22 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:36:03,143 epoch 10 - iter 924/1546 - loss 0.00370004 - time (sec): 71.73 - samples/sec: 1034.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:36:14,990 epoch 10 - iter 1078/1546 - loss 0.00352942 - time (sec): 83.58 - samples/sec: 1041.32 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:36:26,661 epoch 10 - iter 1232/1546 - loss 0.00371630 - time (sec): 95.25 - samples/sec: 1037.65 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:36:38,732 epoch 10 - iter 1386/1546 - loss 0.00370365 - time (sec): 107.32 - samples/sec: 1038.39 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:36:51,551 epoch 10 - iter 1540/1546 - loss 0.00357901 - time (sec): 120.14 - samples/sec: 1030.81 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:36:52,006 ----------------------------------------------------------------------------------------------------
2023-10-17 09:36:52,006 EPOCH 10 done: loss 0.0036 - lr: 0.000000
2023-10-17 09:36:54,823 DEV : loss 0.12681032717227936 - f1-score (micro avg) 0.8109
2023-10-17 09:36:54,853 saving best model
2023-10-17 09:36:56,978 ----------------------------------------------------------------------------------------------------
2023-10-17 09:36:56,980 Loading model from best epoch ...
2023-10-17 09:36:59,153 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 09:37:07,160 Results:
- F-score (micro) 0.8205
- F-score (macro) 0.7279
- Accuracy 0.7179

By class:
              precision    recall  f1-score   support

         LOC     0.8723    0.8594    0.8658       946
    BUILDING     0.6404    0.6162    0.6281       185
      STREET     0.6667    0.7143    0.6897        56

   micro avg     0.8265    0.8147    0.8205      1187
   macro avg     0.7265    0.7300    0.7279      1187
weighted avg     0.8265    0.8147    0.8205      1187

2023-10-17 09:37:07,160 ----------------------------------------------------------------------------------------------------
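As a sanity check on the "By class" table above: the micro average pools true positives, false positives, and false negatives over all classes before computing precision/recall, while the macro average is the unweighted mean of the per-class scores. The arithmetic can be verified directly from the logged values (note the printed numbers are rounded to four digits, so recomputed results match only to within that rounding):

```python
def f1(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# per-class check: LOC precision/recall from the log reproduce its f1-score
print(round(f1(0.8723, 0.8594), 4))  # 0.8658, as in the table

# micro avg: f1(0.8265, 0.8147) agrees with the logged 0.8205 within rounding
print(f1(0.8265, 0.8147))

# macro avg f1 = unweighted mean of the per-class f1 scores (LOC, BUILDING, STREET)
print(round(sum([0.8658, 0.6281, 0.6897]) / 3, 4))  # 0.7279, as in the table
```

The weighted average additionally weights each class by its support (946, 185, 56), which is why it sits much closer to the LOC scores than the macro average does.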
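The lr column in the log reflects the LinearScheduler plugin with warmup_fraction '0.1': the learning rate ramps linearly from 0 to the peak 5e-05 over the first 10% of the 15,460 total batch steps (1,546 iterations per epoch × 10 epochs), i.e. exactly epoch 1, then decays linearly back to 0 by the last step. A plain-Python sketch of that schedule (the function name is ours, not Flair's API):

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay back to 0 (one value per batch step)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1546 * 10  # iterations per epoch x epochs, as in the log above
print(linear_warmup_lr(154, total, 5e-05))   # ~5e-06, logged as 0.000005
print(linear_warmup_lr(1546, total, 5e-05))  # peak lr at end of warmup
print(linear_warmup_lr(total, total, 5e-05)) # 0.0 at the final step
```

The values agree with the logged lr column: 0.000050 at the end of epoch 1, roughly 0.000044 at the end of epoch 2, and 0.000000 at the end of epoch 10.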