2023-10-17 11:03:34,264 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,265 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 11:03:34,265 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,265 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 11:03:34,265 
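The final `Linear(in_features=768, out_features=25)` layer maps each 768-dimensional ELECTRA embedding to 25 tag scores. Those 25 classes are the BIOES scheme over the six entity types in this corpus (scope, pers, work, loc, object, date) plus the O tag, as listed in the tag dictionary at the end of the log. A minimal sketch of how that inventory is sized (the helper name is illustrative, not a Flair API):

```python
def bioes_tag_dictionary(entity_types):
    """Build a BIOES tag inventory: O plus S-/B-/E-/I- prefixed variants per type."""
    tags = ["O"]
    for entity_type in entity_types:
        tags.extend(f"{prefix}-{entity_type}" for prefix in ("S", "B", "E", "I"))
    return tags

# 1 O tag + 4 prefixes x 6 types = 25 tags, matching out_features=25 above
tags = bioes_tag_dictionary(["scope", "pers", "work", "loc", "object", "date"])
```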
----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,265 Train:  966 sentences
2023-10-17 11:03:34,265         (train_with_dev=False, train_with_test=False)
2023-10-17 11:03:34,265 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,265 Training Params:
2023-10-17 11:03:34,265  - learning_rate: "5e-05"
2023-10-17 11:03:34,265  - mini_batch_size: "4"
2023-10-17 11:03:34,265  - max_epochs: "10"
2023-10-17 11:03:34,265  - shuffle: "True"
2023-10-17 11:03:34,265 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,265 Plugins:
2023-10-17 11:03:34,265  - TensorboardLogger
2023-10-17 11:03:34,265  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:03:34,265 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,265 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:03:34,265  - metric: "('micro avg', 'f1-score')"
2023-10-17 11:03:34,265 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,265 Computation:
2023-10-17 11:03:34,265  - compute on device: cuda:0
2023-10-17 11:03:34,265  - embedding storage: none
2023-10-17 11:03:34,265 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,265 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 11:03:34,265 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,266 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:34,266 Logging anything other than scalars to TensorBoard is
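The `LinearScheduler | warmup_fraction: '0.1'` plugin explains the per-iteration `lr` values in the log: with 242 batches per epoch and 10 epochs (about 2420 optimizer steps), the learning rate climbs linearly from 0 to the peak 5e-05 over the first ~10% of steps (roughly epoch 1, matching the logged rise 0.000005 → 0.000049), then decays linearly toward 0 by the last step. A sketch under that assumed warmup-then-linear-decay form (function and argument names are illustrative):

```python
def linear_warmup_decay_lr(step, total_steps=2420, peak_lr=5e-05, warmup_fraction=0.1):
    """Learning rate at a given optimizer step: linear warmup to peak_lr,
    then linear decay to 0 over the remaining steps."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, `linear_warmup_decay_lr(24)` is about 0.000005 and `linear_warmup_decay_lr(484)` (end of epoch 2) is about 0.000044, close to the values logged below.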
currently not supported.
2023-10-17 11:03:35,366 epoch 1 - iter 24/242 - loss 4.26591789 - time (sec): 1.10 - samples/sec: 2380.20 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:03:36,465 epoch 1 - iter 48/242 - loss 3.41019533 - time (sec): 2.20 - samples/sec: 2266.95 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:03:37,565 epoch 1 - iter 72/242 - loss 2.54020713 - time (sec): 3.30 - samples/sec: 2243.57 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:03:38,672 epoch 1 - iter 96/242 - loss 2.07973132 - time (sec): 4.41 - samples/sec: 2217.33 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:03:39,823 epoch 1 - iter 120/242 - loss 1.72326029 - time (sec): 5.56 - samples/sec: 2229.69 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:03:40,931 epoch 1 - iter 144/242 - loss 1.49011539 - time (sec): 6.66 - samples/sec: 2233.87 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:03:42,016 epoch 1 - iter 168/242 - loss 1.33800204 - time (sec): 7.75 - samples/sec: 2218.61 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:03:43,114 epoch 1 - iter 192/242 - loss 1.21558630 - time (sec): 8.85 - samples/sec: 2218.97 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:03:44,208 epoch 1 - iter 216/242 - loss 1.09848132 - time (sec): 9.94 - samples/sec: 2236.96 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:03:45,306 epoch 1 - iter 240/242 - loss 1.02602552 - time (sec): 11.04 - samples/sec: 2232.12 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:03:45,389 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:45,390 EPOCH 1 done: loss 1.0231 - lr: 0.000049
2023-10-17 11:03:46,255 DEV : loss 0.25234466791152954 - f1-score (micro avg)  0.5514
2023-10-17 11:03:46,260 saving best model
2023-10-17 11:03:46,623 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:47,736 epoch 2 - iter 24/242 - loss 0.24499402 - time (sec): 1.11 - samples/sec: 2400.11 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:03:48,859 epoch 2 - iter 48/242 - loss 0.21928629 - time (sec): 2.23 - samples/sec: 2294.74 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:03:50,033 epoch 2 - iter 72/242 - loss 0.20007344 - time (sec): 3.41 - samples/sec: 2159.49 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:03:51,228 epoch 2 - iter 96/242 - loss 0.18547797 - time (sec): 4.60 - samples/sec: 2163.01 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:03:52,388 epoch 2 - iter 120/242 - loss 0.18150498 - time (sec): 5.76 - samples/sec: 2111.58 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:03:53,545 epoch 2 - iter 144/242 - loss 0.18045133 - time (sec): 6.92 - samples/sec: 2116.75 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:03:54,639 epoch 2 - iter 168/242 - loss 0.17645985 - time (sec): 8.01 - samples/sec: 2159.29 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:03:55,695 epoch 2 - iter 192/242 - loss 0.17004379 - time (sec): 9.07 - samples/sec: 2187.51 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:03:56,789 epoch 2 - iter 216/242 - loss 0.17005831 - time (sec): 10.16 - samples/sec: 2193.98 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:03:57,907 epoch 2 - iter 240/242 - loss 0.16684271 - time (sec): 11.28 - samples/sec: 2184.29 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:03:57,998 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:57,998 EPOCH 2 done: loss 0.1668 - lr: 0.000045
2023-10-17 11:03:58,751 DEV : loss 0.14192795753479004 - f1-score (micro avg)  0.7947
2023-10-17 11:03:58,756 saving best model
2023-10-17 11:03:59,217 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:00,326 epoch 3 - iter 24/242 - loss 0.07327286 - time (sec): 1.10 - samples/sec: 2059.31 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:04:01,419 epoch 3 - iter 48/242 - loss 0.10066725 - time (sec): 2.20 - samples/sec: 2172.27 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:04:02,535 epoch 3 - iter 72/242 - loss 0.09181505 - time (sec): 3.31 - samples/sec: 2218.63 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:04:03,648 epoch 3 - iter 96/242 - loss 0.09501350 - time (sec): 4.42 - samples/sec: 2218.11 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:04:04,753 epoch 3 - iter 120/242 - loss 0.10234338 - time (sec): 5.53 - samples/sec: 2213.43 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:04:05,857 epoch 3 - iter 144/242 - loss 0.10629150 - time (sec): 6.63 - samples/sec: 2197.15 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:04:06,945 epoch 3 - iter 168/242 - loss 0.10525866 - time (sec): 7.72 - samples/sec: 2179.24 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:04:08,055 epoch 3 - iter 192/242 - loss 0.10679829 - time (sec): 8.83 - samples/sec: 2179.06 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:04:09,164 epoch 3 - iter 216/242 - loss 0.10656321 - time (sec): 9.94 - samples/sec: 2199.25 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:04:10,296 epoch 3 - iter 240/242 - loss 0.10636724 - time (sec): 11.07 - samples/sec: 2218.12 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:04:10,387 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:10,387 EPOCH 3 done: loss 0.1057 - lr: 0.000039
2023-10-17 11:04:11,135 DEV : loss 0.14712044596672058 - f1-score (micro avg)  0.8314
2023-10-17 11:04:11,140 saving best model
2023-10-17 11:04:11,589 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:12,712 epoch 4 - iter 24/242 - loss 0.09047471 - time (sec): 1.12 - samples/sec: 2200.83 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:04:13,808 epoch 4 - iter 48/242 - loss 0.09144691 - time (sec): 2.22 - samples/sec: 2146.82 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:04:14,909 epoch 4 - iter 72/242 - loss 0.08597404 - time (sec): 3.32 - samples/sec: 2208.50 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:04:16,012 epoch 4 - iter 96/242 - loss 0.08514137 - time (sec): 4.42 - samples/sec: 2190.43 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:04:17,125 epoch 4 - iter 120/242 - loss 0.08635037 - time (sec): 5.53 - samples/sec: 2190.42 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:04:18,247 epoch 4 - iter 144/242 - loss 0.08276915 - time (sec): 6.66 - samples/sec: 2220.87 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:04:19,337 epoch 4 - iter 168/242 - loss 0.08425930 - time (sec): 7.75 - samples/sec: 2201.43 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:04:20,442 epoch 4 - iter 192/242 - loss 0.08232385 - time (sec): 8.85 - samples/sec: 2223.37 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:04:21,589 epoch 4 - iter 216/242 - loss 0.07878861 - time (sec): 10.00 - samples/sec: 2212.48 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:04:22,701 epoch 4 - iter 240/242 - loss 0.07622216 - time (sec): 11.11 - samples/sec: 2209.64 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:04:22,794 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:22,794 EPOCH 4 done: loss 0.0756 - lr: 0.000033
2023-10-17 11:04:23,542 DEV : loss 0.1917840987443924 - f1-score (micro avg)  0.8536
2023-10-17 11:04:23,547 saving best model
2023-10-17 11:04:24,006 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:25,089 epoch 5 - iter 24/242 - loss 0.04282017 - time (sec): 1.08 - samples/sec: 2168.97 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:04:26,223 epoch 5 - iter 48/242 - loss 0.03474492 - time (sec): 2.22 - samples/sec: 2142.01 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:04:27,331 epoch 5 - iter 72/242 - loss 0.05560982 - time (sec): 3.32 - samples/sec: 2170.59 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:04:28,524 epoch 5 - iter 96/242 - loss 0.06150686 - time (sec): 4.52 - samples/sec: 2182.01 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:04:29,698 epoch 5 - iter 120/242 - loss 0.05854896 - time (sec): 5.69 - samples/sec: 2148.75 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:04:30,878 epoch 5 - iter 144/242 - loss 0.05729226 - time (sec): 6.87 - samples/sec: 2170.33 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:04:32,039 epoch 5 - iter 168/242 - loss 0.06214791 - time (sec): 8.03 - samples/sec: 2128.90 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:04:33,242 epoch 5 - iter 192/242 - loss 0.06295404 - time (sec): 9.23 - samples/sec: 2125.07 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:04:34,399 epoch 5 - iter 216/242 - loss 0.06339996 - time (sec): 10.39 - samples/sec: 2130.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:04:35,600 epoch 5 - iter 240/242 - loss 0.06031140 - time (sec): 11.59 - samples/sec: 2119.71 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:04:35,698 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:35,699 EPOCH 5 done: loss 0.0599 - lr: 0.000028
2023-10-17 11:04:36,457 DEV : loss 0.19448469579219818 - f1-score (micro avg)  0.8166
2023-10-17 11:04:36,462 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:37,561 epoch 6 - iter 24/242 - loss 0.06765826 - time (sec): 1.10 - samples/sec: 2192.66 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:04:38,658 epoch 6 - iter 48/242 - loss 0.04932059 - time (sec): 2.19 - samples/sec: 2210.51 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:04:39,785 epoch 6 - iter 72/242 - loss 0.03995058 - time (sec): 3.32 - samples/sec: 2270.16 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:04:40,890 epoch 6 - iter 96/242 - loss 0.04122071 - time (sec): 4.43 - samples/sec: 2246.95 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:04:42,014 epoch 6 - iter 120/242 - loss 0.03975160 - time (sec): 5.55 - samples/sec: 2213.48 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:04:43,111 epoch 6 - iter 144/242 - loss 0.04516966 - time (sec): 6.65 - samples/sec: 2207.39 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:04:44,223 epoch 6 - iter 168/242 - loss 0.04377767 - time (sec): 7.76 - samples/sec: 2228.30 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:04:45,348 epoch 6 - iter 192/242 - loss 0.04344615 - time (sec): 8.88 - samples/sec: 2243.63 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:04:46,506 epoch 6 - iter 216/242 - loss 0.04113733 - time (sec): 10.04 - samples/sec: 2219.75 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:04:47,597 epoch 6 - iter 240/242 - loss 0.04041261 - time (sec): 11.13 - samples/sec: 2210.87 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:04:47,688 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:47,688 EPOCH 6 done: loss 0.0404 - lr: 0.000022
2023-10-17 11:04:48,452 DEV : loss 0.18923352658748627 - f1-score (micro avg)  0.8365
2023-10-17 11:04:48,457 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:49,501 epoch 7 - iter 24/242 - loss 0.04794183 - time (sec): 1.04 - samples/sec: 2103.27 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:04:50,574 epoch 7 - iter 48/242 - loss 0.03471700 - time (sec): 2.12 - samples/sec: 2124.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:04:51,643 epoch 7 - iter 72/242 - loss 0.03410147 - time (sec): 3.19 - samples/sec: 2234.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:04:52,718 epoch 7 - iter 96/242 - loss 0.03766714 - time (sec): 4.26 - samples/sec: 2263.70 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:04:53,777 epoch 7 - iter 120/242 - loss 0.03602439 - time (sec): 5.32 - samples/sec: 2273.82 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:04:54,855 epoch 7 - iter 144/242 - loss 0.03327001 - time (sec): 6.40 - samples/sec: 2332.94 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:04:55,942 epoch 7 - iter 168/242 - loss 0.02895076 - time (sec): 7.48 - samples/sec: 2341.65 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:04:57,028 epoch 7 - iter 192/242 - loss 0.02732392 - time (sec): 8.57 - samples/sec: 2336.24 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:04:58,103 epoch 7 - iter 216/242 - loss 0.02601102 - time (sec): 9.65 - samples/sec: 2320.22 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:04:59,227 epoch 7 - iter 240/242 - loss 0.02669025 - time (sec): 10.77 - samples/sec: 2286.01 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:04:59,311 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:59,311 EPOCH 7 done: loss 0.0273 - lr: 0.000017
2023-10-17 11:05:00,073 DEV : loss 0.2232140451669693 - f1-score (micro avg)  0.8302
2023-10-17 11:05:00,078 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:01,184 epoch 8 - iter 24/242 - loss 0.03013030 - time (sec): 1.10 - samples/sec: 2079.69 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:05:02,307 epoch 8 - iter 48/242 - loss 0.01856627 - time (sec): 2.23 - samples/sec: 2114.97 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:05:03,426 epoch 8 - iter 72/242 - loss 0.02299829 - time (sec): 3.35 - samples/sec: 2173.93 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:05:04,548 epoch 8 - iter 96/242 - loss 0.02070983 - time (sec): 4.47 - samples/sec: 2084.05 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:05:05,646 epoch 8 - iter 120/242 - loss 0.02096218 - time (sec): 5.57 - samples/sec: 2151.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:05:06,785 epoch 8 - iter 144/242 - loss 0.02221831 - time (sec): 6.71 - samples/sec: 2162.48 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:05:07,906 epoch 8 - iter 168/242 - loss 0.01966640 - time (sec): 7.83 - samples/sec: 2164.32 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:05:09,026 epoch 8 - iter 192/242 - loss 0.01911008 - time (sec): 8.95 - samples/sec: 2180.63 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:05:10,142 epoch 8 - iter 216/242 - loss 0.01800677 - time (sec): 10.06 - samples/sec: 2194.85 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:05:11,249 epoch 8 - iter 240/242 - loss 0.01835744 - time (sec): 11.17 - samples/sec: 2202.07 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:05:11,342 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:11,342 EPOCH 8 done: loss 0.0192 - lr: 0.000011
2023-10-17 11:05:12,265 DEV : loss 0.21957647800445557 - f1-score (micro avg)  0.8496
2023-10-17 11:05:12,270 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:13,374 epoch 9 - iter 24/242 - loss 0.01157107 - time (sec): 1.10 - samples/sec: 2130.60 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:05:14,500 epoch 9 - iter 48/242 - loss 0.01386120 - time (sec): 2.23 - samples/sec: 2109.03 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:05:15,612 epoch 9 - iter 72/242 - loss 0.01373285 - time (sec): 3.34 - samples/sec: 2054.51 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:05:16,727 epoch 9 - iter 96/242 - loss 0.01073783 - time (sec): 4.46 - samples/sec: 2120.42 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:05:17,797 epoch 9 - iter 120/242 - loss 0.01083165 - time (sec): 5.53 - samples/sec: 2108.84 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:05:18,848 epoch 9 - iter 144/242 - loss 0.01404710 - time (sec): 6.58 - samples/sec: 2134.07 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:05:19,935 epoch 9 - iter 168/242 - loss 0.01354793 - time (sec): 7.66 - samples/sec: 2175.60 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:05:21,014 epoch 9 - iter 192/242 - loss 0.01192714 - time (sec): 8.74 - samples/sec: 2216.11 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:05:22,142 epoch 9 - iter 216/242 - loss 0.01215783 - time (sec): 9.87 - samples/sec: 2226.83 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:05:23,243 epoch 9 - iter 240/242 - loss 0.01151007 - time (sec): 10.97 - samples/sec: 2234.28 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:05:23,341 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:23,342 EPOCH 9 done: loss 0.0115 - lr: 0.000006
2023-10-17 11:05:24,137 DEV : loss 0.23295238614082336 - f1-score (micro avg)  0.8439
2023-10-17 11:05:24,152 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:25,246 epoch 10 - iter 24/242 - loss 0.00367671 - time (sec): 1.09 - samples/sec: 2188.30 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:05:26,305 epoch 10 - iter 48/242 - loss 0.00613275 - time (sec): 2.15 - samples/sec: 2317.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:05:27,360 epoch 10 - iter 72/242 - loss 0.00701705 - time (sec): 3.21 - samples/sec: 2270.85 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:05:28,419 epoch 10 - iter 96/242 - loss 0.01051967 - time (sec): 4.27 - samples/sec: 2284.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:05:29,483 epoch 10 - iter 120/242 - loss 0.00870223 - time (sec): 5.33 - samples/sec: 2300.58 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:05:30,549 epoch 10 - iter 144/242 - loss 0.00739513 - time (sec): 6.40 - samples/sec: 2296.78 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:05:31,652 epoch 10 - iter 168/242 - loss 0.00708496 - time (sec): 7.50 - samples/sec: 2271.84 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:05:32,742 epoch 10 - iter 192/242 - loss 0.00642611 - time (sec): 8.59 - samples/sec: 2279.48 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:05:33,797 epoch 10 - iter 216/242 - loss 0.00596252 - time (sec): 9.64 - samples/sec: 2296.15 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:05:34,849 epoch 10 - iter 240/242 - loss 0.00652708 - time (sec): 10.70 - samples/sec: 2299.37 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:05:34,933 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:34,933 EPOCH 10 done: loss 0.0065 - lr: 0.000000
2023-10-17 11:05:35,707 DEV : loss 0.23276284337043762 - f1-score (micro avg)  0.8443
2023-10-17 11:05:36,183 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:36,184 Loading model from best epoch ...
2023-10-17 11:05:37,563 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 11:05:38,229 Results:
- F-score (micro) 0.779
- F-score (macro) 0.5053
- Accuracy 0.6598

By class:
              precision    recall  f1-score   support

        pers     0.8129    0.8129    0.8129       139
       scope     0.8394    0.8915    0.8647       129
        work     0.5714    0.7500    0.6486        80
         loc     1.0000    0.1111    0.2000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7565    0.8028    0.7790       360
   macro avg     0.6448    0.5131    0.5053       360
weighted avg     0.7667    0.8028    0.7729       360

2023-10-17 11:05:38,229 ----------------------------------------------------------------------------------------------------
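The headline scores are consistent with the per-class table: the micro F-score 0.779 is the harmonic mean of the micro-averaged precision (0.7565) and recall (0.8028), and the macro F-score 0.5053 is the unweighted mean of the five per-class f1-scores (small differences come from rounding in the printed table). A quick sanity check in plain Python:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row from the results table: precision 0.7565, recall 0.8028
micro_f1 = f1(0.7565, 0.8028)  # close to the reported 0.7790

# per-class f1-scores: pers, scope, work, loc, date
macro_f1 = sum([0.8129, 0.8647, 0.6486, 0.2000, 0.0000]) / 5  # close to the reported 0.5053
```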