2023-10-18 00:31:48,574 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,576 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 00:31:48,576 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,577 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-18 00:31:48,577 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,577 Train: 20847 sentences
2023-10-18 00:31:48,577 (train_with_dev=False, train_with_test=False)
2023-10-18 00:31:48,577 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,577 Training Params:
2023-10-18 00:31:48,577 - learning_rate: "5e-05"
2023-10-18 00:31:48,577 - mini_batch_size: "4"
2023-10-18 00:31:48,577 - max_epochs: "10"
2023-10-18 00:31:48,577 - shuffle: "True"
2023-10-18 00:31:48,577 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,577 Plugins:
2023-10-18 00:31:48,578 - TensorboardLogger
2023-10-18 00:31:48,578 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 00:31:48,578 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,578 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 00:31:48,578 - metric: "('micro avg', 'f1-score')"
2023-10-18 00:31:48,578 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,578 Computation:
2023-10-18 00:31:48,578 - compute on device: cuda:0
2023-10-18 00:31:48,578 - embedding storage: none
2023-10-18 00:31:48,578 ----------------------------------------------------------------------------------------------------
"hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" 2023-10-18 00:31:48,578 ---------------------------------------------------------------------------------------------------- 2023-10-18 00:31:48,578 ---------------------------------------------------------------------------------------------------- 2023-10-18 00:31:48,579 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-18 00:32:29,918 epoch 1 - iter 521/5212 - loss 1.66906812 - time (sec): 41.34 - samples/sec: 930.71 - lr: 0.000005 - momentum: 0.000000 2023-10-18 00:33:10,635 epoch 1 - iter 1042/5212 - loss 1.03719361 - time (sec): 82.05 - samples/sec: 909.14 - lr: 0.000010 - momentum: 0.000000 2023-10-18 00:33:51,571 epoch 1 - iter 1563/5212 - loss 0.79777765 - time (sec): 122.99 - samples/sec: 892.20 - lr: 0.000015 - momentum: 0.000000 2023-10-18 00:34:33,364 epoch 1 - iter 2084/5212 - loss 0.66095486 - time (sec): 164.78 - samples/sec: 895.33 - lr: 0.000020 - momentum: 0.000000 2023-10-18 00:35:14,940 epoch 1 - iter 2605/5212 - loss 0.58101719 - time (sec): 206.36 - samples/sec: 885.33 - lr: 0.000025 - momentum: 0.000000 2023-10-18 00:35:56,337 epoch 1 - iter 3126/5212 - loss 0.52357318 - time (sec): 247.76 - samples/sec: 882.27 - lr: 0.000030 - momentum: 0.000000 2023-10-18 00:36:37,849 epoch 1 - iter 3647/5212 - loss 0.47988816 - time (sec): 289.27 - samples/sec: 891.55 - lr: 0.000035 - momentum: 0.000000 2023-10-18 00:37:19,590 epoch 1 - iter 4168/5212 - loss 0.45555744 - time (sec): 331.01 - samples/sec: 894.11 - lr: 0.000040 - momentum: 0.000000 2023-10-18 00:38:01,595 epoch 1 - iter 4689/5212 - loss 0.43054978 - time (sec): 373.02 - samples/sec: 889.91 - lr: 0.000045 - momentum: 0.000000 2023-10-18 00:38:42,289 epoch 1 - iter 5210/5212 - loss 0.41213934 - time (sec): 413.71 - samples/sec: 887.73 - lr: 0.000050 - momentum: 0.000000 2023-10-18 00:38:42,441 ---------------------------------------------------------------------------------------------------- 2023-10-18 00:38:42,442 EPOCH 1 done: loss 0.4120 - lr: 0.000050 2023-10-18 00:38:50,396 DEV : loss 0.16113218665122986 - f1-score (micro avg) 0.3046 2023-10-18 00:38:50,458 saving best model 2023-10-18 00:38:51,056 ---------------------------------------------------------------------------------------------------- 2023-10-18 00:39:33,061 epoch 2 - iter 521/5212 - loss 0.20391832 - time (sec): 42.00 - samples/sec: 861.95 - lr: 0.000049 - momentum: 0.000000 2023-10-18 00:40:16,191 epoch 2 - iter 1042/5212 - loss 0.20211911 - time (sec): 85.13 - samples/sec: 855.50 - lr: 0.000049 - momentum: 0.000000 2023-10-18 00:40:57,547 epoch 2 - iter 1563/5212 - loss 0.19999558 - time (sec): 126.49 - samples/sec: 860.72 - lr: 0.000048 - momentum: 0.000000 2023-10-18 00:41:39,640 epoch 2 - iter 2084/5212 - loss 0.20513933 - time (sec): 168.58 - samples/sec: 854.18 - lr: 0.000048 - momentum: 0.000000 2023-10-18 00:42:21,573 epoch 2 - iter 2605/5212 - loss 0.20038286 - time (sec): 210.51 - samples/sec: 863.40 - lr: 0.000047 - momentum: 0.000000 2023-10-18 00:43:02,141 epoch 2 - iter 3126/5212 - loss 0.19864827 - time (sec): 251.08 - samples/sec: 869.12 - lr: 0.000047 - momentum: 0.000000 2023-10-18 00:43:42,673 epoch 2 - iter 3647/5212 - loss 0.19854493 - time (sec): 291.61 - samples/sec: 867.61 - lr: 0.000046 - momentum: 0.000000 2023-10-18 00:44:24,671 epoch 2 - iter 4168/5212 - loss 0.19715426 - time (sec): 333.61 - samples/sec: 878.42 - lr: 0.000046 - momentum: 0.000000 
2023-10-18 00:45:07,789 epoch 2 - iter 4689/5212 - loss 0.19743047 - time (sec): 376.73 - samples/sec: 881.73 - lr: 0.000045 - momentum: 0.000000
2023-10-18 00:45:48,782 epoch 2 - iter 5210/5212 - loss 0.19494220 - time (sec): 417.72 - samples/sec: 879.26 - lr: 0.000044 - momentum: 0.000000
2023-10-18 00:45:48,936 ----------------------------------------------------------------------------------------------------
2023-10-18 00:45:48,936 EPOCH 2 done: loss 0.1949 - lr: 0.000044
2023-10-18 00:46:01,264 DEV : loss 0.1894654929637909 - f1-score (micro avg) 0.3482
2023-10-18 00:46:01,333 saving best model
2023-10-18 00:46:02,756 ----------------------------------------------------------------------------------------------------
2023-10-18 00:46:44,151 epoch 3 - iter 521/5212 - loss 0.15457492 - time (sec): 41.39 - samples/sec: 873.70 - lr: 0.000044 - momentum: 0.000000
2023-10-18 00:47:25,991 epoch 3 - iter 1042/5212 - loss 0.14322200 - time (sec): 83.23 - samples/sec: 885.31 - lr: 0.000043 - momentum: 0.000000
2023-10-18 00:48:06,252 epoch 3 - iter 1563/5212 - loss 0.14225428 - time (sec): 123.49 - samples/sec: 875.22 - lr: 0.000043 - momentum: 0.000000
2023-10-18 00:48:48,040 epoch 3 - iter 2084/5212 - loss 0.14483257 - time (sec): 165.28 - samples/sec: 888.19 - lr: 0.000042 - momentum: 0.000000
2023-10-18 00:49:30,204 epoch 3 - iter 2605/5212 - loss 0.14346341 - time (sec): 207.44 - samples/sec: 881.51 - lr: 0.000042 - momentum: 0.000000
2023-10-18 00:50:12,207 epoch 3 - iter 3126/5212 - loss 0.14044970 - time (sec): 249.45 - samples/sec: 882.67 - lr: 0.000041 - momentum: 0.000000
2023-10-18 00:50:54,038 epoch 3 - iter 3647/5212 - loss 0.14195401 - time (sec): 291.28 - samples/sec: 887.79 - lr: 0.000041 - momentum: 0.000000
2023-10-18 00:51:36,494 epoch 3 - iter 4168/5212 - loss 0.14437001 - time (sec): 333.73 - samples/sec: 882.83 - lr: 0.000040 - momentum: 0.000000
2023-10-18 00:52:18,181 epoch 3 - iter 4689/5212 - loss 0.14670163 - time (sec): 375.42 - samples/sec: 877.15 - lr: 0.000039 - momentum: 0.000000
2023-10-18 00:53:00,008 epoch 3 - iter 5210/5212 - loss 0.14610716 - time (sec): 417.25 - samples/sec: 880.17 - lr: 0.000039 - momentum: 0.000000
2023-10-18 00:53:00,154 ----------------------------------------------------------------------------------------------------
2023-10-18 00:53:00,155 EPOCH 3 done: loss 0.1461 - lr: 0.000039
2023-10-18 00:53:11,968 DEV : loss 0.20233668386936188 - f1-score (micro avg) 0.3894
2023-10-18 00:53:12,019 saving best model
2023-10-18 00:53:13,437 ----------------------------------------------------------------------------------------------------
2023-10-18 00:53:55,104 epoch 4 - iter 521/5212 - loss 0.11142981 - time (sec): 41.66 - samples/sec: 895.43 - lr: 0.000038 - momentum: 0.000000
2023-10-18 00:54:39,019 epoch 4 - iter 1042/5212 - loss 0.10715561 - time (sec): 85.58 - samples/sec: 863.93 - lr: 0.000038 - momentum: 0.000000
2023-10-18 00:55:20,313 epoch 4 - iter 1563/5212 - loss 0.11064029 - time (sec): 126.87 - samples/sec: 878.12 - lr: 0.000037 - momentum: 0.000000
2023-10-18 00:56:01,136 epoch 4 - iter 2084/5212 - loss 0.11448970 - time (sec): 167.69 - samples/sec: 872.48 - lr: 0.000037 - momentum: 0.000000
2023-10-18 00:56:41,309 epoch 4 - iter 2605/5212 - loss 0.11567949 - time (sec): 207.87 - samples/sec: 877.17 - lr: 0.000036 - momentum: 0.000000
2023-10-18 00:57:22,189 epoch 4 - iter 3126/5212 - loss 0.11662647 - time (sec): 248.75 - samples/sec: 881.56 - lr: 0.000036 - momentum: 0.000000
2023-10-18 00:58:03,052 epoch 4 - iter 3647/5212 - loss 0.11508943 - time (sec): 289.61 - samples/sec: 883.76 - lr: 0.000035 - momentum: 0.000000
2023-10-18 00:58:45,312 epoch 4 - iter 4168/5212 - loss 0.11455258 - time (sec): 331.87 - samples/sec: 886.66 - lr: 0.000034 - momentum: 0.000000
2023-10-18 00:59:25,409 epoch 4 - iter 4689/5212 - loss 0.11467035 - time (sec): 371.97 - samples/sec: 887.99 - lr: 0.000034 - momentum: 0.000000
2023-10-18 01:00:03,308 epoch 4 - iter 5210/5212 - loss 0.11262872 - time (sec): 409.87 - samples/sec: 896.20 - lr: 0.000033 - momentum: 0.000000
2023-10-18 01:00:03,451 ----------------------------------------------------------------------------------------------------
2023-10-18 01:00:03,452 EPOCH 4 done: loss 0.1126 - lr: 0.000033
2023-10-18 01:00:15,154 DEV : loss 0.2687111496925354 - f1-score (micro avg) 0.3657
2023-10-18 01:00:15,206 ----------------------------------------------------------------------------------------------------
2023-10-18 01:00:54,028 epoch 5 - iter 521/5212 - loss 0.07311031 - time (sec): 38.82 - samples/sec: 928.03 - lr: 0.000033 - momentum: 0.000000
2023-10-18 01:01:34,435 epoch 5 - iter 1042/5212 - loss 0.08017798 - time (sec): 79.23 - samples/sec: 944.73 - lr: 0.000032 - momentum: 0.000000
2023-10-18 01:02:15,896 epoch 5 - iter 1563/5212 - loss 0.07619780 - time (sec): 120.69 - samples/sec: 945.18 - lr: 0.000032 - momentum: 0.000000
2023-10-18 01:02:56,684 epoch 5 - iter 2084/5212 - loss 0.07909195 - time (sec): 161.48 - samples/sec: 924.66 - lr: 0.000031 - momentum: 0.000000
2023-10-18 01:03:37,694 epoch 5 - iter 2605/5212 - loss 0.08215076 - time (sec): 202.49 - samples/sec: 918.77 - lr: 0.000031 - momentum: 0.000000
2023-10-18 01:04:18,619 epoch 5 - iter 3126/5212 - loss 0.08334033 - time (sec): 243.41 - samples/sec: 920.85 - lr: 0.000030 - momentum: 0.000000
2023-10-18 01:04:59,547 epoch 5 - iter 3647/5212 - loss 0.08207112 - time (sec): 284.34 - samples/sec: 917.40 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:05:40,848 epoch 5 - iter 4168/5212 - loss 0.08331101 - time (sec): 325.64 - samples/sec: 913.88 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:06:20,943 epoch 5 - iter 4689/5212 - loss 0.08261802 - time (sec): 365.73 - samples/sec: 912.70 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:07:01,129 epoch 5 - iter 5210/5212 - loss 0.08358709 - time (sec): 405.92 - samples/sec: 905.02 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:07:01,270 ----------------------------------------------------------------------------------------------------
2023-10-18 01:07:01,271 EPOCH 5 done: loss 0.0836 - lr: 0.000028
2023-10-18 01:07:12,187 DEV : loss 0.301369845867157 - f1-score (micro avg) 0.3742
2023-10-18 01:07:12,250 ----------------------------------------------------------------------------------------------------
2023-10-18 01:07:54,992 epoch 6 - iter 521/5212 - loss 0.06866396 - time (sec): 42.74 - samples/sec: 896.81 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:08:36,567 epoch 6 - iter 1042/5212 - loss 0.06896073 - time (sec): 84.31 - samples/sec: 882.54 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:09:16,846 epoch 6 - iter 1563/5212 - loss 0.06751622 - time (sec): 124.59 - samples/sec: 906.02 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:09:57,387 epoch 6 - iter 2084/5212 - loss 0.06455834 - time (sec): 165.13 - samples/sec: 914.12 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:10:38,142 epoch 6 - iter 2605/5212 - loss 0.06669693 - time (sec): 205.89 - samples/sec: 905.24 - lr: 0.000025 - momentum: 0.000000
2023-10-18 01:11:18,567 epoch 6 - iter 3126/5212 - loss 0.06865838 - time (sec): 246.31 - samples/sec: 891.57 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:12:00,376 epoch 6 - iter 3647/5212 - loss 0.06873425 - time (sec): 288.12 - samples/sec: 886.06 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:12:43,940 epoch 6 - iter 4168/5212 - loss 0.06803666 - time (sec): 331.69 - samples/sec: 877.70 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:13:25,958 epoch 6 - iter 4689/5212 - loss 0.06603815 - time (sec): 373.71 - samples/sec: 881.12 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:14:07,418 epoch 6 - iter 5210/5212 - loss 0.06522923 - time (sec): 415.17 - samples/sec: 884.91 - lr: 0.000022 - momentum: 0.000000
2023-10-18 01:14:07,588 ----------------------------------------------------------------------------------------------------
2023-10-18 01:14:07,588 EPOCH 6 done: loss 0.0652 - lr: 0.000022
2023-10-18 01:14:18,954 DEV : loss 0.36671698093414307 - f1-score (micro avg) 0.3549
2023-10-18 01:14:19,009 ----------------------------------------------------------------------------------------------------
2023-10-18 01:15:01,826 epoch 7 - iter 521/5212 - loss 0.03359695 - time (sec): 42.81 - samples/sec: 885.06 - lr: 0.000022 - momentum: 0.000000
2023-10-18 01:15:44,187 epoch 7 - iter 1042/5212 - loss 0.03709266 - time (sec): 85.17 - samples/sec: 872.76 - lr: 0.000021 - momentum: 0.000000
2023-10-18 01:16:24,595 epoch 7 - iter 1563/5212 - loss 0.04253411 - time (sec): 125.58 - samples/sec: 864.90 - lr: 0.000021 - momentum: 0.000000
2023-10-18 01:17:05,737 epoch 7 - iter 2084/5212 - loss 0.04345550 - time (sec): 166.73 - samples/sec: 867.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 01:17:47,521 epoch 7 - iter 2605/5212 - loss 0.04796044 - time (sec): 208.51 - samples/sec: 871.07 - lr: 0.000019 - momentum: 0.000000
2023-10-18 01:18:28,730 epoch 7 - iter 3126/5212 - loss 0.04973625 - time (sec): 249.72 - samples/sec: 866.88 - lr: 0.000019 - momentum: 0.000000
2023-10-18 01:19:10,790 epoch 7 - iter 3647/5212 - loss 0.04988937 - time (sec): 291.78 - samples/sec: 871.16 - lr: 0.000018 - momentum: 0.000000
2023-10-18 01:19:53,123 epoch 7 - iter 4168/5212 - loss 0.04833380 - time (sec): 334.11 - samples/sec: 886.96 - lr: 0.000018 - momentum: 0.000000
2023-10-18 01:20:33,853 epoch 7 - iter 4689/5212 - loss 0.04815110 - time (sec): 374.84 - samples/sec: 881.91 - lr: 0.000017 - momentum: 0.000000
2023-10-18 01:21:14,383 epoch 7 - iter 5210/5212 - loss 0.04638672 - time (sec): 415.37 - samples/sec: 884.32 - lr: 0.000017 - momentum: 0.000000
2023-10-18 01:21:14,530 ----------------------------------------------------------------------------------------------------
2023-10-18 01:21:14,530 EPOCH 7 done: loss 0.0464 - lr: 0.000017
2023-10-18 01:21:25,727 DEV : loss 0.43325743079185486 - f1-score (micro avg) 0.3587
2023-10-18 01:21:25,783 ----------------------------------------------------------------------------------------------------
2023-10-18 01:22:07,559 epoch 8 - iter 521/5212 - loss 0.04888283 - time (sec): 41.77 - samples/sec: 868.38 - lr: 0.000016 - momentum: 0.000000
2023-10-18 01:22:51,475 epoch 8 - iter 1042/5212 - loss 0.05588160 - time (sec): 85.69 - samples/sec: 853.90 - lr: 0.000016 - momentum: 0.000000
2023-10-18 01:23:30,474 epoch 8 - iter 1563/5212 - loss 0.04998288 - time (sec): 124.69 - samples/sec: 860.88 - lr: 0.000015 - momentum: 0.000000
2023-10-18 01:24:11,992 epoch 8 - iter 2084/5212 - loss 0.04736604 - time (sec): 166.21 - samples/sec: 860.69 - lr: 0.000014 - momentum: 0.000000
2023-10-18 01:24:54,218 epoch 8 - iter 2605/5212 - loss 0.04423360 - time (sec): 208.43 - samples/sec: 861.12 - lr: 0.000014 - momentum: 0.000000
2023-10-18 01:25:36,727 epoch 8 - iter 3126/5212 - loss 0.04220802 - time (sec): 250.94 - samples/sec: 867.37 - lr: 0.000013 - momentum: 0.000000
2023-10-18 01:26:21,477 epoch 8 - iter 3647/5212 - loss 0.03990034 - time (sec): 295.69 - samples/sec: 871.35 - lr: 0.000013 - momentum: 0.000000
2023-10-18 01:27:07,438 epoch 8 - iter 4168/5212 - loss 0.03988533 - time (sec): 341.65 - samples/sec: 864.12 - lr: 0.000012 - momentum: 0.000000
2023-10-18 01:27:50,105 epoch 8 - iter 4689/5212 - loss 0.03847361 - time (sec): 384.32 - samples/sec: 859.02 - lr: 0.000012 - momentum: 0.000000
2023-10-18 01:28:33,305 epoch 8 - iter 5210/5212 - loss 0.03727021 - time (sec): 427.52 - samples/sec: 858.96 - lr: 0.000011 - momentum: 0.000000
2023-10-18 01:28:33,460 ----------------------------------------------------------------------------------------------------
2023-10-18 01:28:33,460 EPOCH 8 done: loss 0.0373 - lr: 0.000011
2023-10-18 01:28:44,615 DEV : loss 0.4129716157913208 - f1-score (micro avg) 0.3669
2023-10-18 01:28:44,677 ----------------------------------------------------------------------------------------------------
2023-10-18 01:29:29,811 epoch 9 - iter 521/5212 - loss 0.01869215 - time (sec): 45.13 - samples/sec: 865.28 - lr: 0.000011 - momentum: 0.000000
2023-10-18 01:30:11,850 epoch 9 - iter 1042/5212 - loss 0.02072276 - time (sec): 87.17 - samples/sec: 861.21 - lr: 0.000010 - momentum: 0.000000
2023-10-18 01:30:55,037 epoch 9 - iter 1563/5212 - loss 0.02156431 - time (sec): 130.36 - samples/sec: 845.78 - lr: 0.000009 - momentum: 0.000000
2023-10-18 01:31:38,177 epoch 9 - iter 2084/5212 - loss 0.02107749 - time (sec): 173.50 - samples/sec: 833.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 01:32:20,284 epoch 9 - iter 2605/5212 - loss 0.02033570 - time (sec): 215.60 - samples/sec: 833.24 - lr: 0.000008 - momentum: 0.000000
2023-10-18 01:33:00,969 epoch 9 - iter 3126/5212 - loss 0.02020637 - time (sec): 256.29 - samples/sec: 847.42 - lr: 0.000008 - momentum: 0.000000
2023-10-18 01:33:41,854 epoch 9 - iter 3647/5212 - loss 0.02018397 - time (sec): 297.17 - samples/sec: 856.84 - lr: 0.000007 - momentum: 0.000000
2023-10-18 01:34:24,795 epoch 9 - iter 4168/5212 - loss 0.01967850 - time (sec): 340.12 - samples/sec: 863.34 - lr: 0.000007 - momentum: 0.000000
2023-10-18 01:35:06,699 epoch 9 - iter 4689/5212 - loss 0.01954663 - time (sec): 382.02 - samples/sec: 864.64 - lr: 0.000006 - momentum: 0.000000
2023-10-18 01:35:47,910 epoch 9 - iter 5210/5212 - loss 0.01998115 - time (sec): 423.23 - samples/sec: 867.76 - lr: 0.000006 - momentum: 0.000000
2023-10-18 01:35:48,061 ----------------------------------------------------------------------------------------------------
2023-10-18 01:35:48,061 EPOCH 9 done: loss 0.0200 - lr: 0.000006
2023-10-18 01:36:00,418 DEV : loss 0.4562455713748932 - f1-score (micro avg) 0.3877
2023-10-18 01:36:00,486 ----------------------------------------------------------------------------------------------------
2023-10-18 01:36:44,282 epoch 10 - iter 521/5212 - loss 0.01029065 - time (sec): 43.79 - samples/sec: 856.77 - lr: 0.000005 - momentum: 0.000000
2023-10-18 01:37:26,680 epoch 10 - iter 1042/5212 - loss 0.01334978 - time (sec): 86.19 - samples/sec: 866.79 - lr: 0.000004 - momentum: 0.000000
2023-10-18 01:38:06,606 epoch 10 - iter 1563/5212 - loss 0.01386083 - time (sec): 126.12 - samples/sec: 869.64 - lr: 0.000004 - momentum: 0.000000
2023-10-18 01:38:46,290 epoch 10 - iter 2084/5212 - loss 0.01335369 - time (sec): 165.80 - samples/sec: 867.67 - lr: 0.000003 - momentum: 0.000000
2023-10-18 01:39:27,175 epoch 10 - iter 2605/5212 - loss 0.01357185 - time (sec): 206.69 - samples/sec: 888.86 - lr: 0.000003 - momentum: 0.000000
2023-10-18 01:40:07,806 epoch 10 - iter 3126/5212 - loss 0.01324602 - time (sec): 247.32 - samples/sec: 893.64 - lr: 0.000002 - momentum: 0.000000
2023-10-18 01:40:48,240 epoch 10 - iter 3647/5212 - loss 0.01325955 - time (sec): 287.75 - samples/sec: 904.07 - lr: 0.000002 - momentum: 0.000000
2023-10-18 01:41:28,545 epoch 10 - iter 4168/5212 - loss 0.01305894 - time (sec): 328.06 - samples/sec: 901.55 - lr: 0.000001 - momentum: 0.000000
2023-10-18 01:42:08,664 epoch 10 - iter 4689/5212 - loss 0.01288314 - time (sec): 368.18 - samples/sec: 897.88 - lr: 0.000001 - momentum: 0.000000
2023-10-18 01:42:48,729 epoch 10 - iter 5210/5212 - loss 0.01314818 - time (sec): 408.24 - samples/sec: 899.94 - lr: 0.000000 - momentum: 0.000000
2023-10-18 01:42:48,862 ----------------------------------------------------------------------------------------------------
2023-10-18 01:42:48,862 EPOCH 10 done: loss 0.0131 - lr: 0.000000
2023-10-18 01:43:00,991 DEV : loss 0.47269320487976074 - f1-score (micro avg) 0.3881
2023-10-18 01:43:01,627 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:01,629 Loading model from best epoch ...
2023-10-18 01:43:04,487 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-18 01:43:23,620 Results:
- F-score (micro) 0.3373
- F-score (macro) 0.219
- Accuracy 0.2047

By class:
              precision    recall  f1-score   support

         LOC     0.4867    0.3624    0.4155      1214
         PER     0.3677    0.2426    0.2923       808
         ORG     0.2075    0.1416    0.1684       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4088    0.2870    0.3373      2390
   macro avg     0.2655    0.1867    0.2190      2390
weighted avg     0.4022    0.2870    0.3347      2390

2023-10-18 01:43:23,620 ----------------------------------------------------------------------------------------------------
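Reproduction note: the run logged above could be approximated in Flair roughly as in the sketch below. This is a minimal sketch, not the original training script: the backbone checkpoint name is an assumption inferred from the base-path string ("hmteams/teams-base-historic-multilingual-discriminator"), the corpus-loader arguments are illustrative, and the TensorboardLogger plugin is omitted. Hyperparameters (lr 5e-05, batch size 4, 10 epochs, no CRF, first-subtoken pooling, last layer only) are taken from the log.

# Minimal reproduction sketch (assumes a recent Flair release).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# NewsEye German split of HIPE-2022, matching the MultiCorpus line in the log.
corpus = NER_HIPE_2022(dataset_name="newseye", language="de")
label_dict = corpus.make_label_dictionary(label_type="ner")

# ELECTRA-style encoder, last layer only, first-subtoken pooling
# ("layers-1" / "poolingfirst" in the base path). Checkpoint name is assumed.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear head over the embeddings: no CRF, no RNN, matching the printed
# architecture (LockedDropout + Linear(768, 17) + CrossEntropyLoss).
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear warmup schedule, which matches the
# LinearScheduler plugin (warmup_fraction 0.1) and the lr curve in the log.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/de-hmteams/...",  # output base path abbreviated; full name in the log
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
)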
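Usage note: once best-model.pt exists under the base path, the tagger can be loaded and applied as sketched below. The example sentence and the abbreviated path are illustrative only.

# Inference sketch for the exported checkpoint.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("hmbench-newseye/de-hmteams/.../best-model.pt")  # path abbreviated

# Illustrative German sentence (not taken from the corpus).
sentence = Sentence("Der Kaiser reiste von Wien nach Berlin .")
tagger.predict(sentence)

# Prints spans labelled with the 17-tag BIOES dictionary shown in the log
# (LOC, PER, ORG, HumanProd).
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)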