2023-10-25 14:35:36,306 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Train:  20847 sentences
2023-10-25 14:35:36,307         (train_with_dev=False, train_with_test=False)
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Training Params:
2023-10-25 14:35:36,307  - learning_rate: "5e-05"
2023-10-25 14:35:36,307  - mini_batch_size: "8"
2023-10-25 14:35:36,307  - max_epochs: "10"
2023-10-25 14:35:36,307  - shuffle: "True"
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Plugins:
2023-10-25 14:35:36,307  - TensorboardLogger
2023-10-25 14:35:36,307  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 14:35:36,307  - metric: "('micro avg', 'f1-score')"
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Computation:
2023-10-25 14:35:36,307  - compute on device: cuda:0
2023-10-25 14:35:36,307  - embedding storage: none
2023-10-25 14:35:36,308 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,308 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 14:35:36,308 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,308 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,308 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 14:35:50,873 epoch 1 - iter 260/2606 - loss 1.43007414 - time (sec): 14.56 - samples/sec: 2541.19 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:36:05,195 epoch 1 - iter 520/2606 - loss 0.88642286 - time (sec): 28.89 - samples/sec: 2530.05 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:36:20,044 epoch 1 - iter 780/2606 - loss 0.68416171 - time (sec): 43.74 - samples/sec: 2544.39 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:36:34,261 epoch 1 - iter 1040/2606 - loss 0.56767487 - time (sec): 57.95 - samples/sec: 2555.77 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:36:48,977 epoch 1 - iter 1300/2606 - loss 0.49684606 - time (sec): 72.67 - samples/sec: 2593.07 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:37:03,212 epoch 1 - iter 1560/2606 - loss 0.44567246 - time (sec): 86.90 - samples/sec: 2589.23 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:37:17,495 epoch 1 - iter 1820/2606 - loss 0.41053349 - time (sec): 101.19 - samples/sec: 2585.19 - lr: 0.000035 - momentum: 0.000000
2023-10-25 14:37:31,904 epoch 1 - iter 2080/2606 - loss 0.38157875 - time (sec): 115.60 - samples/sec: 2583.32 - lr: 0.000040 - momentum: 0.000000
2023-10-25 14:37:45,329 epoch 1 - iter 2340/2606 - loss 0.36197031 - time (sec): 129.02 - samples/sec: 2572.51 - lr: 0.000045 - momentum: 0.000000
2023-10-25 14:37:59,142 epoch 1 - iter 2600/2606 - loss 0.34509274 - time (sec): 142.83 - samples/sec: 2569.22 - lr: 0.000050 - momentum: 0.000000
2023-10-25 14:37:59,394 ----------------------------------------------------------------------------------------------------
2023-10-25 14:37:59,395 EPOCH 1 done: loss 0.3449 - lr: 0.000050
2023-10-25 14:38:03,168 DEV : loss 0.14861008524894714 - f1-score (micro avg)  0.3075
2023-10-25 14:38:03,193 saving best model
2023-10-25 14:38:03,720 ----------------------------------------------------------------------------------------------------
2023-10-25 14:38:17,970 epoch 2 - iter 260/2606 - loss 0.16010954 - time (sec): 14.25 - samples/sec: 2672.86 - lr: 0.000049 - momentum: 0.000000
2023-10-25 14:38:32,344 epoch 2 - iter 520/2606 - loss 0.16216140 - time (sec): 28.62 - samples/sec: 2671.12 - lr: 0.000049 - momentum: 0.000000
2023-10-25 14:38:47,501 epoch 2 - iter 780/2606 - loss 0.16202218 - time (sec): 43.78 - samples/sec: 2591.71 - lr: 0.000048 - momentum: 0.000000
2023-10-25 14:39:01,279 epoch 2 - iter 1040/2606 - loss 0.16269454 - time (sec): 57.56 - samples/sec: 2603.63 - lr: 0.000048 - momentum: 0.000000
2023-10-25 14:39:15,219 epoch 2 - iter 1300/2606 - loss 0.16161422 - time (sec): 71.50 - samples/sec: 2605.20 - lr: 0.000047 - momentum: 0.000000
2023-10-25 14:39:28,712 epoch 2 - iter 1560/2606 - loss 0.16050413 - time (sec): 84.99 - samples/sec: 2609.10 - lr: 0.000047 - momentum: 0.000000
2023-10-25 14:39:43,253 epoch 2 - iter 1820/2606 - loss 0.16221687 - time (sec): 99.53 - samples/sec: 2603.35 - lr: 0.000046 - momentum: 0.000000
2023-10-25 14:39:57,463 epoch 2 - iter 2080/2606 - loss 0.16080811 - time (sec): 113.74 - samples/sec: 2606.42 - lr: 0.000046 - momentum: 0.000000
2023-10-25 14:40:11,416 epoch 2 - iter 2340/2606 - loss 0.15978570 - time (sec): 127.69 - samples/sec: 2582.57 - lr: 0.000045 - momentum: 0.000000
2023-10-25 14:40:26,509 epoch 2 - iter 2600/2606 - loss 0.15840144 - time (sec): 142.79 - samples/sec: 2565.54 - lr: 0.000044 - momentum: 0.000000
2023-10-25 14:40:26,911 ----------------------------------------------------------------------------------------------------
2023-10-25 14:40:26,911 EPOCH 2 done: loss 0.1582 - lr: 0.000044
2023-10-25 14:40:34,413 DEV : loss 0.2031174749135971 - f1-score (micro avg)  0.3265
2023-10-25 14:40:34,438 saving best model
2023-10-25 14:40:35,160 ----------------------------------------------------------------------------------------------------
2023-10-25 14:40:49,669 epoch 3 - iter 260/2606 - loss 0.09809148 - time (sec): 14.51 - samples/sec: 2529.64 - lr: 0.000044 - momentum: 0.000000
2023-10-25 14:41:03,475 epoch 3 - iter 520/2606 - loss 0.11472092 - time (sec): 28.31 - samples/sec: 2528.48 - lr: 0.000043 - momentum: 0.000000
2023-10-25 14:41:17,287 epoch 3 - iter 780/2606 - loss 0.11436832 - time (sec): 42.13 - samples/sec: 2553.09 - lr: 0.000043 - momentum: 0.000000
2023-10-25 14:41:31,543 epoch 3 - iter 1040/2606 - loss 0.10975702 - time (sec): 56.38 - samples/sec: 2584.12 - lr: 0.000042 - momentum: 0.000000
2023-10-25 14:41:45,618 epoch 3 - iter 1300/2606 - loss 0.10721993 - time (sec): 70.46 - samples/sec: 2599.38 - lr: 0.000042 - momentum: 0.000000
2023-10-25 14:41:59,386 epoch 3 - iter 1560/2606 - loss 0.11156032 - time (sec): 84.22 - samples/sec: 2602.20 - lr: 0.000041 - momentum: 0.000000
2023-10-25 14:42:13,163 epoch 3 - iter 1820/2606 - loss 0.11371807 - time (sec): 98.00 - samples/sec: 2602.70 - lr: 0.000041 - momentum: 0.000000
2023-10-25 14:42:27,303 epoch 3 - iter 2080/2606 - loss 0.11301284 - time (sec): 112.14 - samples/sec: 2604.51 - lr: 0.000040 - momentum: 0.000000
2023-10-25 14:42:40,919 epoch 3 - iter 2340/2606 - loss 0.11223997 - time (sec): 125.76 - samples/sec: 2599.81 - lr: 0.000039 - momentum: 0.000000
2023-10-25 14:42:55,448 epoch 3 - iter 2600/2606 - loss 0.11074835 - time (sec): 140.29 - samples/sec: 2612.40 - lr: 0.000039 - momentum: 0.000000
2023-10-25 14:42:55,774 ----------------------------------------------------------------------------------------------------
2023-10-25 14:42:55,774 EPOCH 3 done: loss 0.1106 - lr: 0.000039
2023-10-25 14:43:02,639 DEV : loss 0.19291090965270996 - f1-score (micro avg)  0.3613
2023-10-25 14:43:02,664 saving best model
2023-10-25 14:43:03,328 ----------------------------------------------------------------------------------------------------
2023-10-25 14:43:17,067 epoch 4 - iter 260/2606 - loss 0.09500081 - time (sec): 13.74 - samples/sec: 2632.02 - lr: 0.000038 - momentum: 0.000000
2023-10-25 14:43:31,086 epoch 4 - iter 520/2606 - loss 0.09830307 - time (sec): 27.76 - samples/sec: 2617.40 - lr: 0.000038 - momentum: 0.000000
2023-10-25 14:43:46,455 epoch 4 - iter 780/2606 - loss 0.09455546 - time (sec): 43.13 - samples/sec: 2531.37 - lr: 0.000037 - momentum: 0.000000
2023-10-25 14:44:00,720 epoch 4 - iter 1040/2606 - loss 0.09159831 - time (sec): 57.39 - samples/sec: 2495.89 - lr: 0.000037 - momentum: 0.000000
2023-10-25 14:44:14,439 epoch 4 - iter 1300/2606 - loss 0.09101211 - time (sec): 71.11 - samples/sec: 2502.90 - lr: 0.000036 - momentum: 0.000000
2023-10-25 14:44:29,101 epoch 4 - iter 1560/2606 - loss 0.08649454 - time (sec): 85.77 - samples/sec: 2547.17 - lr: 0.000036 - momentum: 0.000000
2023-10-25 14:44:42,716 epoch 4 - iter 1820/2606 - loss 0.08545536 - time (sec): 99.39 - samples/sec: 2538.89 - lr: 0.000035 - momentum: 0.000000
2023-10-25 14:44:57,358 epoch 4 - iter 2080/2606 - loss 0.08615483 - time (sec): 114.03 - samples/sec: 2542.81 - lr: 0.000034 - momentum: 0.000000
2023-10-25 14:45:11,633 epoch 4 - iter 2340/2606 - loss 0.08630774 - time (sec): 128.30 - samples/sec: 2544.39 - lr: 0.000034 - momentum: 0.000000
2023-10-25 14:45:26,422 epoch 4 - iter 2600/2606 - loss 0.08626198 - time (sec): 143.09 - samples/sec: 2559.67 - lr: 0.000033 - momentum: 0.000000
2023-10-25 14:45:26,820 ----------------------------------------------------------------------------------------------------
2023-10-25 14:45:26,820 EPOCH 4 done: loss 0.0863 - lr: 0.000033
2023-10-25 14:45:33,781 DEV : loss 0.2698976397514343 - f1-score (micro avg)  0.3764
2023-10-25 14:45:33,806 saving best model
2023-10-25 14:45:34,465 ----------------------------------------------------------------------------------------------------
2023-10-25 14:45:48,639 epoch 5 - iter 260/2606 - loss 0.06464820 - time (sec): 14.17 - samples/sec: 2626.43 - lr: 0.000033 - momentum: 0.000000
2023-10-25 14:46:03,035 epoch 5 - iter 520/2606 - loss 0.06110421 - time (sec): 28.57 - samples/sec: 2543.20 - lr: 0.000032 - momentum: 0.000000
2023-10-25 14:46:17,861 epoch 5 - iter 780/2606 - loss 0.06016891 - time (sec): 43.39 - samples/sec: 2548.67 - lr: 0.000032 - momentum: 0.000000
2023-10-25 14:46:32,445 epoch 5 - iter 1040/2606 - loss 0.06173170 - time (sec): 57.98 - samples/sec: 2555.72 - lr: 0.000031 - momentum: 0.000000
2023-10-25 14:46:46,809 epoch 5 - iter 1300/2606 - loss 0.06133749 - time (sec): 72.34 - samples/sec: 2522.61 - lr: 0.000031 - momentum: 0.000000
2023-10-25 14:47:01,071 epoch 5 - iter 1560/2606 - loss 0.06007684 - time (sec): 86.60 - samples/sec: 2530.55 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:47:15,449 epoch 5 - iter 1820/2606 - loss 0.06078408 - time (sec): 100.98 - samples/sec: 2533.31 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:47:29,645 epoch 5 - iter 2080/2606 - loss 0.06059075 - time (sec): 115.18 - samples/sec: 2552.45 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:47:44,001 epoch 5 - iter 2340/2606 - loss 0.06046212 - time (sec): 129.53 - samples/sec: 2566.87 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:47:57,841 epoch 5 - iter 2600/2606 - loss 0.06127827 - time (sec): 143.37 - samples/sec: 2556.47 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:47:58,210 ----------------------------------------------------------------------------------------------------
2023-10-25 14:47:58,210 EPOCH 5 done: loss 0.0613 - lr: 0.000028
2023-10-25 14:48:05,406 DEV : loss 0.3795294165611267 - f1-score (micro avg)  0.3326
2023-10-25 14:48:05,434 ----------------------------------------------------------------------------------------------------
2023-10-25 14:48:20,290 epoch 6 - iter 260/2606 - loss 0.03491277 - time (sec): 14.86 - samples/sec: 2652.35 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:48:35,779 epoch 6 - iter 520/2606 - loss 0.04096153 - time (sec): 30.34 - samples/sec: 2591.06 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:48:49,582 epoch 6 - iter 780/2606 - loss 0.04201657 - time (sec): 44.15 - samples/sec: 2561.47 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:49:03,570 epoch 6 - iter 1040/2606 - loss 0.04304112 - time (sec): 58.13 - samples/sec: 2565.68 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:49:17,029 epoch 6 - iter 1300/2606 - loss 0.04336291 - time (sec): 71.59 - samples/sec: 2550.43 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:49:31,156 epoch 6 - iter 1560/2606 - loss 0.04379894 - time (sec): 85.72 - samples/sec: 2550.97 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:49:45,284 epoch 6 - iter 1820/2606 - loss 0.04398689 - time (sec): 99.85 - samples/sec: 2566.43 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:49:59,537 epoch 6 - iter 2080/2606 - loss 0.04462424 - time (sec): 114.10 - samples/sec: 2575.83 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:50:13,887 epoch 6 - iter 2340/2606 - loss 0.04389891 - time (sec): 128.45 - samples/sec: 2578.71 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:50:27,662 epoch 6 - iter 2600/2606 - loss 0.04366628 - time (sec): 142.23 - samples/sec: 2572.78 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:50:28,029 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:28,030 EPOCH 6 done: loss 0.0437 - lr: 0.000022
2023-10-25 14:50:34,281 DEV : loss 0.3496846556663513 - f1-score (micro avg)  0.3677
2023-10-25 14:50:34,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:49,498 epoch 7 - iter 260/2606 - loss 0.03844051 - time (sec): 15.19 - samples/sec: 2291.49 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:51:04,819 epoch 7 - iter 520/2606 - loss 0.03763631 - time (sec): 30.51 - samples/sec: 2355.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:51:19,867 epoch 7 - iter 780/2606 - loss 0.04085011 - time (sec): 45.56 - samples/sec: 2305.17 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:51:35,277 epoch 7 - iter 1040/2606 - loss 0.04072254 - time (sec): 60.97 - samples/sec: 2328.77 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:51:49,777 epoch 7 - iter 1300/2606 - loss 0.04275275 - time (sec): 75.47 - samples/sec: 2367.75 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:52:04,500 epoch 7 - iter 1560/2606 - loss 0.04412206 - time (sec): 90.19 - samples/sec: 2437.18 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:52:19,146 epoch 7 - iter 1820/2606 - loss 0.04560131 - time (sec): 104.84 - samples/sec: 2483.64 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:52:33,348 epoch 7 - iter 2080/2606 - loss 0.04383087 - time (sec): 119.04 - samples/sec: 2498.57 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:52:47,443 epoch 7 - iter 2340/2606 - loss 0.04287800 - time (sec): 133.13 - samples/sec: 2508.05 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:53:00,987 epoch 7 - iter 2600/2606 - loss 0.04293301 - time (sec): 146.68 - samples/sec: 2499.36 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:53:01,273 ----------------------------------------------------------------------------------------------------
2023-10-25 14:53:01,273 EPOCH 7 done: loss 0.0430 - lr: 0.000017
2023-10-25 14:53:07,539 DEV : loss 0.36888352036476135 - f1-score (micro avg)  0.3547
2023-10-25 14:53:07,566 ----------------------------------------------------------------------------------------------------
2023-10-25 14:53:22,206 epoch 8 - iter 260/2606 - loss 0.02816273 - time (sec): 14.64 - samples/sec: 2654.17 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:53:36,106 epoch 8 - iter 520/2606 - loss 0.02706506 - time (sec): 28.54 - samples/sec: 2565.72 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:53:50,421 epoch 8 - iter 780/2606 - loss 0.03077908 - time (sec): 42.85 - samples/sec: 2559.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:54:04,697 epoch 8 - iter 1040/2606 - loss 0.03177140 - time (sec): 57.13 - samples/sec: 2549.73 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:54:19,291 epoch 8 - iter 1300/2606 - loss 0.03241797 - time (sec): 71.72 - samples/sec: 2560.08 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:54:34,241 epoch 8 - iter 1560/2606 - loss 0.04472240 - time (sec): 86.67 - samples/sec: 2569.48 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:54:48,203 epoch 8 - iter 1820/2606 - loss 0.06082510 - time (sec): 100.64 - samples/sec: 2564.29 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:55:03,356 epoch 8 - iter 2080/2606 - loss 0.07237813 - time (sec): 115.79 - samples/sec: 2590.36 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:55:17,114 epoch 8 - iter 2340/2606 - loss 0.07735544 - time (sec): 129.55 - samples/sec: 2577.12 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:55:30,594 epoch 8 - iter 2600/2606 - loss 0.08074278 - time (sec): 143.03 - samples/sec: 2562.17 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:55:31,015 ----------------------------------------------------------------------------------------------------
2023-10-25 14:55:31,015 EPOCH 8 done: loss 0.0806 - lr: 0.000011
2023-10-25 14:55:37,310 DEV : loss 0.3248702585697174 - f1-score (micro avg)  0.2139
2023-10-25 14:55:37,335 ----------------------------------------------------------------------------------------------------
2023-10-25 14:55:51,369 epoch 9 - iter 260/2606 - loss 0.09042002 - time (sec): 14.03 - samples/sec: 2440.70 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:56:05,964 epoch 9 - iter 520/2606 - loss 0.10574243 - time (sec): 28.63 - samples/sec: 2529.19 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:56:20,271 epoch 9 - iter 780/2606 - loss 0.10510265 - time (sec): 42.93 - samples/sec: 2535.61 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:56:34,213 epoch 9 - iter 1040/2606 - loss 0.11581644 - time (sec): 56.88 - samples/sec: 2537.07 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:56:48,138 epoch 9 - iter 1300/2606 - loss 0.13853151 - time (sec): 70.80 - samples/sec: 2534.52 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:57:02,718 epoch 9 - iter 1560/2606 - loss 0.15621653 - time (sec): 85.38 - samples/sec: 2538.87 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:57:16,241 epoch 9 - iter 1820/2606 - loss 0.16924463 - time (sec): 98.90 - samples/sec: 2563.69 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:57:31,403 epoch 9 - iter 2080/2606 - loss 0.17046855 - time (sec): 114.07 - samples/sec: 2567.41 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:57:45,821 epoch 9 - iter 2340/2606 - loss 0.17400048 - time (sec): 128.48 - samples/sec: 2576.01 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:57:59,796 epoch 9 - iter 2600/2606 - loss 0.17816634 - time (sec): 142.46 - samples/sec: 2574.61 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:58:00,124 ----------------------------------------------------------------------------------------------------
2023-10-25 14:58:00,124 EPOCH 9 done: loss 0.1782 - lr: 0.000006
2023-10-25 14:58:06,453 DEV : loss 0.22723859548568726 - f1-score (micro avg)  0.0329
2023-10-25 14:58:06,479 ----------------------------------------------------------------------------------------------------
2023-10-25 14:58:20,284 epoch 10 - iter 260/2606 - loss 0.19008451 - time (sec): 13.80 - samples/sec: 2551.98 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:58:34,669 epoch 10 - iter 520/2606 - loss 0.19479366 - time (sec): 28.19 - samples/sec: 2599.77 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:58:48,736 epoch 10 - iter 780/2606 - loss 0.19834988 - time (sec): 42.26 - samples/sec: 2530.31 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:59:02,964 epoch 10 - iter 1040/2606 - loss 0.19699333 - time (sec): 56.48 - samples/sec: 2567.60 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:59:17,539 epoch 10 - iter 1300/2606 - loss 0.18959408 - time (sec): 71.06 - samples/sec: 2580.81 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:59:32,058 epoch 10 - iter 1560/2606 - loss 0.18754436 - time (sec): 85.58 - samples/sec: 2585.89 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:59:46,141 epoch 10 - iter 1820/2606 - loss 0.19219019 - time (sec): 99.66 - samples/sec: 2586.46 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:00:00,445 epoch 10 - iter 2080/2606 - loss 0.19419016 - time (sec): 113.96 - samples/sec: 2574.81 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:00:14,275 epoch 10 - iter 2340/2606 - loss 0.19370314 - time (sec): 127.79 - samples/sec: 2571.63 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:00:29,133 epoch 10 - iter 2600/2606 - loss 0.19280325 - time (sec): 142.65 - samples/sec: 2569.33 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:00:29,439 ----------------------------------------------------------------------------------------------------
2023-10-25 15:00:29,439 EPOCH 10 done: loss 0.1929 - lr: 0.000000
2023-10-25 15:00:36,359 DEV : loss 0.25543084740638733 - f1-score (micro avg)  0.0519
2023-10-25 15:00:37,019 ----------------------------------------------------------------------------------------------------
2023-10-25 15:00:37,020 Loading model from best epoch ...
2023-10-25 15:00:39,021 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 15:00:49,051 
Results:
- F-score (micro) 0.4518
- F-score (macro) 0.3002
- Accuracy 0.2956

By class:
              precision    recall  f1-score   support

         LOC     0.4964    0.5700    0.5307      1214
         PER     0.3949    0.4394    0.4159       808
         ORG     0.2628    0.2465    0.2544       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4312    0.4745    0.4518      2390
   macro avg     0.2885    0.3140    0.3002      2390
weighted avg     0.4245    0.4745    0.4477      2390

2023-10-25 15:00:49,051 ----------------------------------------------------------------------------------------------------