2023-10-17 19:43:42,471 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,472 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 19:43:42,472 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,472 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 19:43:42,472 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,472 Train: 5901 sentences
2023-10-17 19:43:42,472 (train_with_dev=False, train_with_test=False)
2023-10-17 19:43:42,472 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,472 Training Params:
2023-10-17 19:43:42,473  - learning_rate: "3e-05"
2023-10-17 19:43:42,473  - mini_batch_size: "8"
2023-10-17 19:43:42,473  - max_epochs: "10"
2023-10-17 19:43:42,473  - shuffle: "True"
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Plugins:
2023-10-17 19:43:42,473  - TensorboardLogger
2023-10-17 19:43:42,473  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:43:42,473  - metric: "('micro avg', 'f1-score')"
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Computation:
2023-10-17 19:43:42,473  - compute on device: cuda:0
2023-10-17 19:43:42,473  - embedding storage: none
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:43:48,751 epoch 1 - iter 73/738 - loss 3.18129576 - time (sec): 6.28 - samples/sec: 2801.48 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:43:53,251 epoch 1 - iter 146/738 - loss 2.19020696 - time (sec): 10.78 - samples/sec: 3050.59 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:43:59,111 epoch 1 - iter 219/738 - loss 1.59513825 - time (sec): 16.64 - samples/sec: 3079.50 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:44:05,045 epoch 1 - iter 292/738 - loss 1.27764894 - time (sec): 22.57 - samples/sec: 3059.98 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:44:10,208 epoch 1 - iter 365/738 - loss 1.09648871 - time (sec): 27.73 - samples/sec: 3067.33 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:44:14,907 epoch 1 - iter 438/738 - loss 0.97918922 - time (sec): 32.43 - samples/sec: 3080.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:44:19,805 epoch 1 - iter 511/738 - loss 0.88580507 - time (sec): 37.33 - samples/sec: 3085.82 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:44:24,890 epoch 1 - iter 584/738 - loss 0.80554273 - time (sec): 42.42 - samples/sec: 3097.71 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:44:30,040 epoch 1 - iter 657/738 - loss 0.73810377 - time (sec): 47.57 - samples/sec: 3101.81 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:44:35,093 epoch 1 - iter 730/738 - loss 0.67997669 - time (sec): 52.62 - samples/sec: 3129.86 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:44:35,557 ----------------------------------------------------------------------------------------------------
2023-10-17 19:44:35,557 EPOCH 1 done: loss 0.6742 - lr: 0.000030
2023-10-17 19:44:41,721 DEV : loss 0.12336786091327667 - f1-score (micro avg) 0.7553
2023-10-17 19:44:41,759 saving best model
2023-10-17 19:44:42,193 ----------------------------------------------------------------------------------------------------
2023-10-17 19:44:48,004 epoch 2 - iter 73/738 - loss 0.15982108 - time (sec): 5.81 - samples/sec: 2868.17 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:44:53,384 epoch 2 - iter 146/738 - loss 0.15219823 - time (sec): 11.19 - samples/sec: 3108.03 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:44:58,736 epoch 2 - iter 219/738 - loss 0.14354631 - time (sec): 16.54 - samples/sec: 3140.24 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:03,750 epoch 2 - iter 292/738 - loss 0.13970635 - time (sec): 21.55 - samples/sec: 3123.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:08,549 epoch 2 - iter 365/738 - loss 0.13505059 - time (sec): 26.35 - samples/sec: 3108.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:13,319 epoch 2 - iter 438/738 - loss 0.13319422 - time (sec): 31.12 - samples/sec: 3120.46 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:18,429 epoch 2 - iter 511/738 - loss 0.12880087 - time (sec): 36.23 - samples/sec: 3131.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:23,596 epoch 2 - iter 584/738 - loss 0.12583910 - time (sec): 41.40 - samples/sec: 3127.20 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:29,201 epoch 2 - iter 657/738 - loss 0.12490350 - time (sec): 47.01 - samples/sec: 3139.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:34,718 epoch 2 - iter 730/738 - loss 0.12241339 - time (sec): 52.52 - samples/sec: 3133.31 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:35,417 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:35,417 EPOCH 2 done: loss 0.1220 - lr: 0.000027
2023-10-17 19:45:46,937 DEV : loss 0.09578868746757507 - f1-score (micro avg) 0.8146
2023-10-17 19:45:46,965 saving best model
2023-10-17 19:45:47,493 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:53,270 epoch 3 - iter 73/738 - loss 0.07059181 - time (sec): 5.77 - samples/sec: 3049.57 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:45:58,525 epoch 3 - iter 146/738 - loss 0.07590889 - time (sec): 11.03 - samples/sec: 3161.90 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:03,634 epoch 3 - iter 219/738 - loss 0.07091362 - time (sec): 16.14 - samples/sec: 3199.78 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:08,697 epoch 3 - iter 292/738 - loss 0.06883331 - time (sec): 21.20 - samples/sec: 3180.84 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:13,735 epoch 3 - iter 365/738 - loss 0.07059965 - time (sec): 26.24 - samples/sec: 3176.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:18,955 epoch 3 - iter 438/738 - loss 0.07205663 - time (sec): 31.46 - samples/sec: 3143.20 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:24,569 epoch 3 - iter 511/738 - loss 0.07270546 - time (sec): 37.07 - samples/sec: 3157.14 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:29,790 epoch 3 - iter 584/738 - loss 0.07234630 - time (sec): 42.29 - samples/sec: 3145.32 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:34,895 epoch 3 - iter 657/738 - loss 0.07186579 - time (sec): 47.40 - samples/sec: 3143.08 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:39,719 epoch 3 - iter 730/738 - loss 0.07250374 - time (sec): 52.22 - samples/sec: 3159.36 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:46:40,195 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:40,195 EPOCH 3 done: loss 0.0722 - lr: 0.000023
2023-10-17 19:46:51,335 DEV : loss 0.10583800822496414 - f1-score (micro avg) 0.833
2023-10-17 19:46:51,364 saving best model
2023-10-17 19:46:51,881 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:57,112 epoch 4 - iter 73/738 - loss 0.05206689 - time (sec): 5.23 - samples/sec: 3037.15 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:47:02,454 epoch 4 - iter 146/738 - loss 0.04622094 - time (sec): 10.57 - samples/sec: 3169.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:47:07,164 epoch 4 - iter 219/738 - loss 0.05037936 - time (sec): 15.28 - samples/sec: 3196.93 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:47:12,269 epoch 4 - iter 292/738 - loss 0.05234898 - time (sec): 20.39 - samples/sec: 3188.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:47:17,049 epoch 4 - iter 365/738 - loss 0.05275237 - time (sec): 25.17 - samples/sec: 3168.35 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:47:22,024 epoch 4 - iter 438/738 - loss 0.05040956 - time (sec): 30.14 - samples/sec: 3198.64 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:47:26,845 epoch 4 - iter 511/738 - loss 0.04932344 - time (sec): 34.96 - samples/sec: 3209.89 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:47:32,525 epoch 4 - iter 584/738 - loss 0.04781680 - time (sec): 40.64 - samples/sec: 3196.99 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:47:37,836 epoch 4 - iter 657/738 - loss 0.04819721 - time (sec): 45.95 - samples/sec: 3181.08 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:47:43,898 epoch 4 - iter 730/738 - loss 0.04886480 - time (sec): 52.02 - samples/sec: 3167.06 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:47:44,407 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:44,407 EPOCH 4 done: loss 0.0485 - lr: 0.000020
2023-10-17 19:47:55,491 DEV : loss 0.122743621468544 - f1-score (micro avg) 0.8485
2023-10-17 19:47:55,519 saving best model
2023-10-17 19:47:56,054 ----------------------------------------------------------------------------------------------------
2023-10-17 19:48:01,493 epoch 5 - iter 73/738 - loss 0.02654138 - time (sec): 5.44 - samples/sec: 3265.24 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:48:06,555 epoch 5 - iter 146/738 - loss 0.03073291 - time (sec): 10.50 - samples/sec: 3185.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:48:11,730 epoch 5 - iter 219/738 - loss 0.03108548 - time (sec): 15.67 - samples/sec: 3161.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:48:16,908 epoch 5 - iter 292/738 - loss 0.03530107 - time (sec): 20.85 - samples/sec: 3182.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:48:21,927 epoch 5 - iter 365/738 - loss 0.03345830 - time (sec): 25.87 - samples/sec: 3204.30 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:48:27,157 epoch 5 - iter 438/738 - loss 0.03329768 - time (sec): 31.10 - samples/sec: 3199.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:48:32,613 epoch 5 - iter 511/738 - loss 0.03272234 - time (sec): 36.56 - samples/sec: 3155.39 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:48:37,419 epoch 5 - iter 584/738 - loss 0.03290570 - time (sec): 41.36 - samples/sec: 3159.03 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:48:42,520 epoch 5 - iter 657/738 - loss 0.03309942 - time (sec): 46.46 - samples/sec: 3171.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:48:47,829 epoch 5 - iter 730/738 - loss 0.03334347 - time (sec): 51.77 - samples/sec: 3170.29 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:48:48,658 ----------------------------------------------------------------------------------------------------
2023-10-17 19:48:48,658 EPOCH 5 done: loss 0.0335 - lr: 0.000017
2023-10-17 19:48:59,704 DEV : loss 0.15168806910514832 - f1-score (micro avg) 0.8543
2023-10-17 19:48:59,733 saving best model
2023-10-17 19:49:00,281 ----------------------------------------------------------------------------------------------------
2023-10-17 19:49:05,290 epoch 6 - iter 73/738 - loss 0.03010752 - time (sec): 5.01 - samples/sec: 3140.40 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:49:10,877 epoch 6 - iter 146/738 - loss 0.02499735 - time (sec): 10.59 - samples/sec: 3101.45 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:49:16,327 epoch 6 - iter 219/738 - loss 0.02175671 - time (sec): 16.05 - samples/sec: 3082.57 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:49:21,571 epoch 6 - iter 292/738 - loss 0.02484654 - time (sec): 21.29 - samples/sec: 3068.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:49:26,919 epoch 6 - iter 365/738 - loss 0.02430030 - time (sec): 26.64 - samples/sec: 3058.19 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:49:32,199 epoch 6 - iter 438/738 - loss 0.02418786 - time (sec): 31.92 - samples/sec: 3041.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:49:37,745 epoch 6 - iter 511/738 - loss 0.02390168 - time (sec): 37.46 - samples/sec: 3036.04 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:49:42,959 epoch 6 - iter 584/738 - loss 0.02289941 - time (sec): 42.68 - samples/sec: 3066.21 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:49:48,171 epoch 6 - iter 657/738 - loss 0.02342068 - time (sec): 47.89 - samples/sec: 3066.73 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:49:53,684 epoch 6 - iter 730/738 - loss 0.02460435 - time (sec): 53.40 - samples/sec: 3080.94 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:49:54,408 ----------------------------------------------------------------------------------------------------
2023-10-17 19:49:54,408 EPOCH 6 done: loss 0.0246 - lr: 0.000013
2023-10-17 19:50:05,542 DEV : loss 0.15729978680610657 - f1-score (micro avg) 0.8547
2023-10-17 19:50:05,573 saving best model
2023-10-17 19:50:06,189 ----------------------------------------------------------------------------------------------------
2023-10-17 19:50:11,582 epoch 7 - iter 73/738 - loss 0.01280464 - time (sec): 5.39 - samples/sec: 3121.20 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:50:16,753 epoch 7 - iter 146/738 - loss 0.01448297 - time (sec): 10.56 - samples/sec: 3123.91 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:50:22,379 epoch 7 - iter 219/738 - loss 0.01626473 - time (sec): 16.18 - samples/sec: 3133.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:50:27,912 epoch 7 - iter 292/738 - loss 0.01707587 - time (sec): 21.72 - samples/sec: 3140.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:50:33,061 epoch 7 - iter 365/738 - loss 0.01952204 - time (sec): 26.87 - samples/sec: 3121.60 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:50:38,280 epoch 7 - iter 438/738 - loss 0.01973386 - time (sec): 32.09 - samples/sec: 3134.20 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:50:43,038 epoch 7 - iter 511/738 - loss 0.01921562 - time (sec): 36.84 - samples/sec: 3158.94 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:50:48,334 epoch 7 - iter 584/738 - loss 0.01911006 - time (sec): 42.14 - samples/sec: 3166.35 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:50:53,630 epoch 7 - iter 657/738 - loss 0.01943647 - time (sec): 47.44 - samples/sec: 3163.17 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:50:58,375 epoch 7 - iter 730/738 - loss 0.01835721 - time (sec): 52.18 - samples/sec: 3160.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:50:58,850 ----------------------------------------------------------------------------------------------------
2023-10-17 19:50:58,850 EPOCH 7 done: loss 0.0183 - lr: 0.000010
2023-10-17 19:51:09,961 DEV : loss 0.18084457516670227 - f1-score (micro avg) 0.8646
2023-10-17 19:51:09,994 saving best model
2023-10-17 19:51:10,719 ----------------------------------------------------------------------------------------------------
2023-10-17 19:51:15,879 epoch 8 - iter 73/738 - loss 0.00764996 - time (sec): 5.16 - samples/sec: 3152.05 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:51:22,133 epoch 8 - iter 146/738 - loss 0.01104963 - time (sec): 11.41 - samples/sec: 2973.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:51:27,099 epoch 8 - iter 219/738 - loss 0.01198235 - time (sec): 16.38 - samples/sec: 3023.19 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:51:32,039 epoch 8 - iter 292/738 - loss 0.01113988 - time (sec): 21.32 - samples/sec: 3063.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:51:37,404 epoch 8 - iter 365/738 - loss 0.01216710 - time (sec): 26.68 - samples/sec: 3071.41 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:51:42,200 epoch 8 - iter 438/738 - loss 0.01124920 - time (sec): 31.48 - samples/sec: 3103.15 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:51:48,209 epoch 8 - iter 511/738 - loss 0.01270444 - time (sec): 37.49 - samples/sec: 3113.35 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:51:53,103 epoch 8 - iter 584/738 - loss 0.01213295 - time (sec): 42.38 - samples/sec: 3133.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:51:58,181 epoch 8 - iter 657/738 - loss 0.01173209 - time (sec): 47.46 - samples/sec: 3143.35 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:52:03,086 epoch 8 - iter 730/738 - loss 0.01264595 - time (sec): 52.37 - samples/sec: 3142.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:52:03,702 ----------------------------------------------------------------------------------------------------
2023-10-17 19:52:03,702 EPOCH 8 done: loss 0.0126 - lr: 0.000007
2023-10-17 19:52:15,277 DEV : loss 0.19484567642211914 - f1-score (micro avg) 0.8602
2023-10-17 19:52:15,312 ----------------------------------------------------------------------------------------------------
2023-10-17 19:52:20,982 epoch 9 - iter 73/738 - loss 0.00488093 - time (sec): 5.67 - samples/sec: 3162.54 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:52:26,399 epoch 9 - iter 146/738 - loss 0.00805781 - time (sec): 11.09 - samples/sec: 3264.48 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:52:32,156 epoch 9 - iter 219/738 - loss 0.00865579 - time (sec): 16.84 - samples/sec: 3235.87 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:52:37,449 epoch 9 - iter 292/738 - loss 0.00824878 - time (sec): 22.14 - samples/sec: 3143.10 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:52:42,684 epoch 9 - iter 365/738 - loss 0.00755746 - time (sec): 27.37 - samples/sec: 3121.26 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:52:47,474 epoch 9 - iter 438/738 - loss 0.00745069 - time (sec): 32.16 - samples/sec: 3162.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:52:52,032 epoch 9 - iter 511/738 - loss 0.00907204 - time (sec): 36.72 - samples/sec: 3177.35 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:52:56,789 epoch 9 - iter 584/738 - loss 0.00911722 - time (sec): 41.48 - samples/sec: 3186.87 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:53:02,404 epoch 9 - iter 657/738 - loss 0.00897147 - time (sec): 47.09 - samples/sec: 3187.92 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:53:06,847 epoch 9 - iter 730/738 - loss 0.00954450 - time (sec): 51.53 - samples/sec: 3191.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:53:07,402 ----------------------------------------------------------------------------------------------------
2023-10-17 19:53:07,402 EPOCH 9 done: loss 0.0095 - lr: 0.000003
2023-10-17 19:53:18,725 DEV : loss 0.1866709142923355 - f1-score (micro avg) 0.8599
2023-10-17 19:53:18,756 ----------------------------------------------------------------------------------------------------
2023-10-17 19:53:24,448 epoch 10 - iter 73/738 - loss 0.00755432 - time (sec): 5.69 - samples/sec: 3440.94 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:53:29,523 epoch 10 - iter 146/738 - loss 0.00658582 - time (sec): 10.77 - samples/sec: 3419.43 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:53:34,251 epoch 10 - iter 219/738 - loss 0.00573246 - time (sec): 15.49 - samples/sec: 3424.02 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:53:38,964 epoch 10 - iter 292/738 - loss 0.00503723 - time (sec): 20.21 - samples/sec: 3347.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:53:43,952 epoch 10 - iter 365/738 - loss 0.00546895 - time (sec): 25.19 - samples/sec: 3299.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:53:48,719 epoch 10 - iter 438/738 - loss 0.00520885 - time (sec): 29.96 - samples/sec: 3314.36 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:53:54,231 epoch 10 - iter 511/738 - loss 0.00544848 - time (sec): 35.47 - samples/sec: 3288.43 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:53:58,768 epoch 10 - iter 584/738 - loss 0.00613301 - time (sec): 40.01 - samples/sec: 3293.93 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:54:03,447 epoch 10 - iter 657/738 - loss 0.00650287 - time (sec): 44.69 - samples/sec: 3300.55 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:54:08,736 epoch 10 - iter 730/738 - loss 0.00664178 - time (sec): 49.98 - samples/sec: 3299.34 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:54:09,185 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:09,185 EPOCH 10 done: loss 0.0067 - lr: 0.000000
2023-10-17 19:54:20,252 DEV : loss 0.1855737715959549 - f1-score (micro avg) 0.8652
2023-10-17 19:54:20,280 saving best model
2023-10-17 19:54:21,202 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:21,204 Loading model from best epoch ...
2023-10-17 19:54:23,111 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 19:54:29,618
Results:
- F-score (micro) 0.8059
- F-score (macro) 0.7164
- Accuracy 0.6929

By class:
              precision    recall  f1-score   support

         loc     0.8570    0.8800    0.8683       858
        pers     0.7792    0.8082    0.7934       537
         org     0.5467    0.6212    0.5816       132
        prod     0.7679    0.7049    0.7350        61
        time     0.5645    0.6481    0.6034        54

   micro avg     0.7907    0.8216    0.8059      1642
   macro avg     0.7030    0.7325    0.7164      1642
weighted avg     0.7937    0.8216    0.8071      1642

2023-10-17 19:54:29,618 ----------------------------------------------------------------------------------------------------