stefan-it — Upload folder using huggingface_hub (commit aee6af9)
2023-10-18 00:31:48,574 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,576 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
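As a sanity check on the repr above, the layer shapes it prints are enough to tally the encoder's parameter count by hand; no torch needed. This is a minimal arithmetic sketch, assuming the standard PyTorch conventions that `Linear(in, out, bias=True)` holds `in*out + out` parameters and `LayerNorm((768,))` holds `2*768`.

```python
# Tally ElectraModel parameters from the module shapes printed in the repr above.

def linear(n_in, n_out):
    # Linear(in_features=n_in, out_features=n_out, bias=True)
    return n_in * n_out + n_out

# ElectraEmbeddings: word + position + token-type embeddings + LayerNorm
embeddings = 32001 * 768 + 512 * 768 + 2 * 768 + 2 * 768

# One ElectraLayer: Q/K/V + attention output dense + LayerNorm,
# then the feed-forward block (768 -> 3072 -> 768) + LayerNorm
attention = 3 * linear(768, 768) + linear(768, 768) + 2 * 768
ffn = linear(768, 3072) + linear(3072, 768) + 2 * 768
per_layer = attention + ffn

encoder_total = embeddings + 12 * per_layer
tagging_head = linear(768, 17)  # the (linear) layer of the SequenceTagger

print(per_layer)      # parameters in one of the 12 encoder layers
print(encoder_total)  # roughly 110M, typical for a base-sized encoder
```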
2023-10-18 00:31:48,576 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,577 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-18 00:31:48,577 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,577 Train: 20847 sentences
2023-10-18 00:31:48,577 (train_with_dev=False, train_with_test=False)
2023-10-18 00:31:48,577 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,577 Training Params:
2023-10-18 00:31:48,577 - learning_rate: "5e-05"
2023-10-18 00:31:48,577 - mini_batch_size: "4"
2023-10-18 00:31:48,577 - max_epochs: "10"
2023-10-18 00:31:48,577 - shuffle: "True"
2023-10-18 00:31:48,577 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,577 Plugins:
2023-10-18 00:31:48,578 - TensorboardLogger
2023-10-18 00:31:48,578 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 00:31:48,578 ----------------------------------------------------------------------------------------------------
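The `LinearScheduler | warmup_fraction: '0.1'` plugin ramps the learning rate up over the first 10% of all mini-batch steps and then decays it linearly to zero, which matches the `lr:` column in the log below (about 5e-6 at iter 521 of epoch 1, the peak 5e-5 at the end of epoch 1, and ~4.4e-5 by the end of epoch 2). A minimal sketch of that piecewise-linear shape, assuming 10 epochs × 5212 mini-batches as logged:

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Piecewise-linear LR: ramp up over the warmup steps, then decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 5212  # 10 epochs x 5212 mini-batches
print(linear_schedule_lr(521, total))    # ~5e-6, matches epoch 1, iter 521
print(linear_schedule_lr(5212, total))   # peak 5e-5, reached right after warmup
print(linear_schedule_lr(10422, total))  # ~4.4e-5, matches end of epoch 2
```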
2023-10-18 00:31:48,578 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 00:31:48,578 - metric: "('micro avg', 'f1-score')"
2023-10-18 00:31:48,578 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,578 Computation:
2023-10-18 00:31:48,578 - compute on device: cuda:0
2023-10-18 00:31:48,578 - embedding storage: none
2023-10-18 00:31:48,578 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,578 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 00:31:48,578 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,578 ----------------------------------------------------------------------------------------------------
2023-10-18 00:31:48,579 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 00:32:29,918 epoch 1 - iter 521/5212 - loss 1.66906812 - time (sec): 41.34 - samples/sec: 930.71 - lr: 0.000005 - momentum: 0.000000
2023-10-18 00:33:10,635 epoch 1 - iter 1042/5212 - loss 1.03719361 - time (sec): 82.05 - samples/sec: 909.14 - lr: 0.000010 - momentum: 0.000000
2023-10-18 00:33:51,571 epoch 1 - iter 1563/5212 - loss 0.79777765 - time (sec): 122.99 - samples/sec: 892.20 - lr: 0.000015 - momentum: 0.000000
2023-10-18 00:34:33,364 epoch 1 - iter 2084/5212 - loss 0.66095486 - time (sec): 164.78 - samples/sec: 895.33 - lr: 0.000020 - momentum: 0.000000
2023-10-18 00:35:14,940 epoch 1 - iter 2605/5212 - loss 0.58101719 - time (sec): 206.36 - samples/sec: 885.33 - lr: 0.000025 - momentum: 0.000000
2023-10-18 00:35:56,337 epoch 1 - iter 3126/5212 - loss 0.52357318 - time (sec): 247.76 - samples/sec: 882.27 - lr: 0.000030 - momentum: 0.000000
2023-10-18 00:36:37,849 epoch 1 - iter 3647/5212 - loss 0.47988816 - time (sec): 289.27 - samples/sec: 891.55 - lr: 0.000035 - momentum: 0.000000
2023-10-18 00:37:19,590 epoch 1 - iter 4168/5212 - loss 0.45555744 - time (sec): 331.01 - samples/sec: 894.11 - lr: 0.000040 - momentum: 0.000000
2023-10-18 00:38:01,595 epoch 1 - iter 4689/5212 - loss 0.43054978 - time (sec): 373.02 - samples/sec: 889.91 - lr: 0.000045 - momentum: 0.000000
2023-10-18 00:38:42,289 epoch 1 - iter 5210/5212 - loss 0.41213934 - time (sec): 413.71 - samples/sec: 887.73 - lr: 0.000050 - momentum: 0.000000
2023-10-18 00:38:42,441 ----------------------------------------------------------------------------------------------------
2023-10-18 00:38:42,442 EPOCH 1 done: loss 0.4120 - lr: 0.000050
2023-10-18 00:38:50,396 DEV : loss 0.16113218665122986 - f1-score (micro avg) 0.3046
2023-10-18 00:38:50,458 saving best model
2023-10-18 00:38:51,056 ----------------------------------------------------------------------------------------------------
2023-10-18 00:39:33,061 epoch 2 - iter 521/5212 - loss 0.20391832 - time (sec): 42.00 - samples/sec: 861.95 - lr: 0.000049 - momentum: 0.000000
2023-10-18 00:40:16,191 epoch 2 - iter 1042/5212 - loss 0.20211911 - time (sec): 85.13 - samples/sec: 855.50 - lr: 0.000049 - momentum: 0.000000
2023-10-18 00:40:57,547 epoch 2 - iter 1563/5212 - loss 0.19999558 - time (sec): 126.49 - samples/sec: 860.72 - lr: 0.000048 - momentum: 0.000000
2023-10-18 00:41:39,640 epoch 2 - iter 2084/5212 - loss 0.20513933 - time (sec): 168.58 - samples/sec: 854.18 - lr: 0.000048 - momentum: 0.000000
2023-10-18 00:42:21,573 epoch 2 - iter 2605/5212 - loss 0.20038286 - time (sec): 210.51 - samples/sec: 863.40 - lr: 0.000047 - momentum: 0.000000
2023-10-18 00:43:02,141 epoch 2 - iter 3126/5212 - loss 0.19864827 - time (sec): 251.08 - samples/sec: 869.12 - lr: 0.000047 - momentum: 0.000000
2023-10-18 00:43:42,673 epoch 2 - iter 3647/5212 - loss 0.19854493 - time (sec): 291.61 - samples/sec: 867.61 - lr: 0.000046 - momentum: 0.000000
2023-10-18 00:44:24,671 epoch 2 - iter 4168/5212 - loss 0.19715426 - time (sec): 333.61 - samples/sec: 878.42 - lr: 0.000046 - momentum: 0.000000
2023-10-18 00:45:07,789 epoch 2 - iter 4689/5212 - loss 0.19743047 - time (sec): 376.73 - samples/sec: 881.73 - lr: 0.000045 - momentum: 0.000000
2023-10-18 00:45:48,782 epoch 2 - iter 5210/5212 - loss 0.19494220 - time (sec): 417.72 - samples/sec: 879.26 - lr: 0.000044 - momentum: 0.000000
2023-10-18 00:45:48,936 ----------------------------------------------------------------------------------------------------
2023-10-18 00:45:48,936 EPOCH 2 done: loss 0.1949 - lr: 0.000044
2023-10-18 00:46:01,264 DEV : loss 0.1894654929637909 - f1-score (micro avg) 0.3482
2023-10-18 00:46:01,333 saving best model
2023-10-18 00:46:02,756 ----------------------------------------------------------------------------------------------------
2023-10-18 00:46:44,151 epoch 3 - iter 521/5212 - loss 0.15457492 - time (sec): 41.39 - samples/sec: 873.70 - lr: 0.000044 - momentum: 0.000000
2023-10-18 00:47:25,991 epoch 3 - iter 1042/5212 - loss 0.14322200 - time (sec): 83.23 - samples/sec: 885.31 - lr: 0.000043 - momentum: 0.000000
2023-10-18 00:48:06,252 epoch 3 - iter 1563/5212 - loss 0.14225428 - time (sec): 123.49 - samples/sec: 875.22 - lr: 0.000043 - momentum: 0.000000
2023-10-18 00:48:48,040 epoch 3 - iter 2084/5212 - loss 0.14483257 - time (sec): 165.28 - samples/sec: 888.19 - lr: 0.000042 - momentum: 0.000000
2023-10-18 00:49:30,204 epoch 3 - iter 2605/5212 - loss 0.14346341 - time (sec): 207.44 - samples/sec: 881.51 - lr: 0.000042 - momentum: 0.000000
2023-10-18 00:50:12,207 epoch 3 - iter 3126/5212 - loss 0.14044970 - time (sec): 249.45 - samples/sec: 882.67 - lr: 0.000041 - momentum: 0.000000
2023-10-18 00:50:54,038 epoch 3 - iter 3647/5212 - loss 0.14195401 - time (sec): 291.28 - samples/sec: 887.79 - lr: 0.000041 - momentum: 0.000000
2023-10-18 00:51:36,494 epoch 3 - iter 4168/5212 - loss 0.14437001 - time (sec): 333.73 - samples/sec: 882.83 - lr: 0.000040 - momentum: 0.000000
2023-10-18 00:52:18,181 epoch 3 - iter 4689/5212 - loss 0.14670163 - time (sec): 375.42 - samples/sec: 877.15 - lr: 0.000039 - momentum: 0.000000
2023-10-18 00:53:00,008 epoch 3 - iter 5210/5212 - loss 0.14610716 - time (sec): 417.25 - samples/sec: 880.17 - lr: 0.000039 - momentum: 0.000000
2023-10-18 00:53:00,154 ----------------------------------------------------------------------------------------------------
2023-10-18 00:53:00,155 EPOCH 3 done: loss 0.1461 - lr: 0.000039
2023-10-18 00:53:11,968 DEV : loss 0.20233668386936188 - f1-score (micro avg) 0.3894
2023-10-18 00:53:12,019 saving best model
2023-10-18 00:53:13,437 ----------------------------------------------------------------------------------------------------
2023-10-18 00:53:55,104 epoch 4 - iter 521/5212 - loss 0.11142981 - time (sec): 41.66 - samples/sec: 895.43 - lr: 0.000038 - momentum: 0.000000
2023-10-18 00:54:39,019 epoch 4 - iter 1042/5212 - loss 0.10715561 - time (sec): 85.58 - samples/sec: 863.93 - lr: 0.000038 - momentum: 0.000000
2023-10-18 00:55:20,313 epoch 4 - iter 1563/5212 - loss 0.11064029 - time (sec): 126.87 - samples/sec: 878.12 - lr: 0.000037 - momentum: 0.000000
2023-10-18 00:56:01,136 epoch 4 - iter 2084/5212 - loss 0.11448970 - time (sec): 167.69 - samples/sec: 872.48 - lr: 0.000037 - momentum: 0.000000
2023-10-18 00:56:41,309 epoch 4 - iter 2605/5212 - loss 0.11567949 - time (sec): 207.87 - samples/sec: 877.17 - lr: 0.000036 - momentum: 0.000000
2023-10-18 00:57:22,189 epoch 4 - iter 3126/5212 - loss 0.11662647 - time (sec): 248.75 - samples/sec: 881.56 - lr: 0.000036 - momentum: 0.000000
2023-10-18 00:58:03,052 epoch 4 - iter 3647/5212 - loss 0.11508943 - time (sec): 289.61 - samples/sec: 883.76 - lr: 0.000035 - momentum: 0.000000
2023-10-18 00:58:45,312 epoch 4 - iter 4168/5212 - loss 0.11455258 - time (sec): 331.87 - samples/sec: 886.66 - lr: 0.000034 - momentum: 0.000000
2023-10-18 00:59:25,409 epoch 4 - iter 4689/5212 - loss 0.11467035 - time (sec): 371.97 - samples/sec: 887.99 - lr: 0.000034 - momentum: 0.000000
2023-10-18 01:00:03,308 epoch 4 - iter 5210/5212 - loss 0.11262872 - time (sec): 409.87 - samples/sec: 896.20 - lr: 0.000033 - momentum: 0.000000
2023-10-18 01:00:03,451 ----------------------------------------------------------------------------------------------------
2023-10-18 01:00:03,452 EPOCH 4 done: loss 0.1126 - lr: 0.000033
2023-10-18 01:00:15,154 DEV : loss 0.2687111496925354 - f1-score (micro avg) 0.3657
2023-10-18 01:00:15,206 ----------------------------------------------------------------------------------------------------
2023-10-18 01:00:54,028 epoch 5 - iter 521/5212 - loss 0.07311031 - time (sec): 38.82 - samples/sec: 928.03 - lr: 0.000033 - momentum: 0.000000
2023-10-18 01:01:34,435 epoch 5 - iter 1042/5212 - loss 0.08017798 - time (sec): 79.23 - samples/sec: 944.73 - lr: 0.000032 - momentum: 0.000000
2023-10-18 01:02:15,896 epoch 5 - iter 1563/5212 - loss 0.07619780 - time (sec): 120.69 - samples/sec: 945.18 - lr: 0.000032 - momentum: 0.000000
2023-10-18 01:02:56,684 epoch 5 - iter 2084/5212 - loss 0.07909195 - time (sec): 161.48 - samples/sec: 924.66 - lr: 0.000031 - momentum: 0.000000
2023-10-18 01:03:37,694 epoch 5 - iter 2605/5212 - loss 0.08215076 - time (sec): 202.49 - samples/sec: 918.77 - lr: 0.000031 - momentum: 0.000000
2023-10-18 01:04:18,619 epoch 5 - iter 3126/5212 - loss 0.08334033 - time (sec): 243.41 - samples/sec: 920.85 - lr: 0.000030 - momentum: 0.000000
2023-10-18 01:04:59,547 epoch 5 - iter 3647/5212 - loss 0.08207112 - time (sec): 284.34 - samples/sec: 917.40 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:05:40,848 epoch 5 - iter 4168/5212 - loss 0.08331101 - time (sec): 325.64 - samples/sec: 913.88 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:06:20,943 epoch 5 - iter 4689/5212 - loss 0.08261802 - time (sec): 365.73 - samples/sec: 912.70 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:07:01,129 epoch 5 - iter 5210/5212 - loss 0.08358709 - time (sec): 405.92 - samples/sec: 905.02 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:07:01,270 ----------------------------------------------------------------------------------------------------
2023-10-18 01:07:01,271 EPOCH 5 done: loss 0.0836 - lr: 0.000028
2023-10-18 01:07:12,187 DEV : loss 0.301369845867157 - f1-score (micro avg) 0.3742
2023-10-18 01:07:12,250 ----------------------------------------------------------------------------------------------------
2023-10-18 01:07:54,992 epoch 6 - iter 521/5212 - loss 0.06866396 - time (sec): 42.74 - samples/sec: 896.81 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:08:36,567 epoch 6 - iter 1042/5212 - loss 0.06896073 - time (sec): 84.31 - samples/sec: 882.54 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:09:16,846 epoch 6 - iter 1563/5212 - loss 0.06751622 - time (sec): 124.59 - samples/sec: 906.02 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:09:57,387 epoch 6 - iter 2084/5212 - loss 0.06455834 - time (sec): 165.13 - samples/sec: 914.12 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:10:38,142 epoch 6 - iter 2605/5212 - loss 0.06669693 - time (sec): 205.89 - samples/sec: 905.24 - lr: 0.000025 - momentum: 0.000000
2023-10-18 01:11:18,567 epoch 6 - iter 3126/5212 - loss 0.06865838 - time (sec): 246.31 - samples/sec: 891.57 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:12:00,376 epoch 6 - iter 3647/5212 - loss 0.06873425 - time (sec): 288.12 - samples/sec: 886.06 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:12:43,940 epoch 6 - iter 4168/5212 - loss 0.06803666 - time (sec): 331.69 - samples/sec: 877.70 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:13:25,958 epoch 6 - iter 4689/5212 - loss 0.06603815 - time (sec): 373.71 - samples/sec: 881.12 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:14:07,418 epoch 6 - iter 5210/5212 - loss 0.06522923 - time (sec): 415.17 - samples/sec: 884.91 - lr: 0.000022 - momentum: 0.000000
2023-10-18 01:14:07,588 ----------------------------------------------------------------------------------------------------
2023-10-18 01:14:07,588 EPOCH 6 done: loss 0.0652 - lr: 0.000022
2023-10-18 01:14:18,954 DEV : loss 0.36671698093414307 - f1-score (micro avg) 0.3549
2023-10-18 01:14:19,009 ----------------------------------------------------------------------------------------------------
2023-10-18 01:15:01,826 epoch 7 - iter 521/5212 - loss 0.03359695 - time (sec): 42.81 - samples/sec: 885.06 - lr: 0.000022 - momentum: 0.000000
2023-10-18 01:15:44,187 epoch 7 - iter 1042/5212 - loss 0.03709266 - time (sec): 85.17 - samples/sec: 872.76 - lr: 0.000021 - momentum: 0.000000
2023-10-18 01:16:24,595 epoch 7 - iter 1563/5212 - loss 0.04253411 - time (sec): 125.58 - samples/sec: 864.90 - lr: 0.000021 - momentum: 0.000000
2023-10-18 01:17:05,737 epoch 7 - iter 2084/5212 - loss 0.04345550 - time (sec): 166.73 - samples/sec: 867.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 01:17:47,521 epoch 7 - iter 2605/5212 - loss 0.04796044 - time (sec): 208.51 - samples/sec: 871.07 - lr: 0.000019 - momentum: 0.000000
2023-10-18 01:18:28,730 epoch 7 - iter 3126/5212 - loss 0.04973625 - time (sec): 249.72 - samples/sec: 866.88 - lr: 0.000019 - momentum: 0.000000
2023-10-18 01:19:10,790 epoch 7 - iter 3647/5212 - loss 0.04988937 - time (sec): 291.78 - samples/sec: 871.16 - lr: 0.000018 - momentum: 0.000000
2023-10-18 01:19:53,123 epoch 7 - iter 4168/5212 - loss 0.04833380 - time (sec): 334.11 - samples/sec: 886.96 - lr: 0.000018 - momentum: 0.000000
2023-10-18 01:20:33,853 epoch 7 - iter 4689/5212 - loss 0.04815110 - time (sec): 374.84 - samples/sec: 881.91 - lr: 0.000017 - momentum: 0.000000
2023-10-18 01:21:14,383 epoch 7 - iter 5210/5212 - loss 0.04638672 - time (sec): 415.37 - samples/sec: 884.32 - lr: 0.000017 - momentum: 0.000000
2023-10-18 01:21:14,530 ----------------------------------------------------------------------------------------------------
2023-10-18 01:21:14,530 EPOCH 7 done: loss 0.0464 - lr: 0.000017
2023-10-18 01:21:25,727 DEV : loss 0.43325743079185486 - f1-score (micro avg) 0.3587
2023-10-18 01:21:25,783 ----------------------------------------------------------------------------------------------------
2023-10-18 01:22:07,559 epoch 8 - iter 521/5212 - loss 0.04888283 - time (sec): 41.77 - samples/sec: 868.38 - lr: 0.000016 - momentum: 0.000000
2023-10-18 01:22:51,475 epoch 8 - iter 1042/5212 - loss 0.05588160 - time (sec): 85.69 - samples/sec: 853.90 - lr: 0.000016 - momentum: 0.000000
2023-10-18 01:23:30,474 epoch 8 - iter 1563/5212 - loss 0.04998288 - time (sec): 124.69 - samples/sec: 860.88 - lr: 0.000015 - momentum: 0.000000
2023-10-18 01:24:11,992 epoch 8 - iter 2084/5212 - loss 0.04736604 - time (sec): 166.21 - samples/sec: 860.69 - lr: 0.000014 - momentum: 0.000000
2023-10-18 01:24:54,218 epoch 8 - iter 2605/5212 - loss 0.04423360 - time (sec): 208.43 - samples/sec: 861.12 - lr: 0.000014 - momentum: 0.000000
2023-10-18 01:25:36,727 epoch 8 - iter 3126/5212 - loss 0.04220802 - time (sec): 250.94 - samples/sec: 867.37 - lr: 0.000013 - momentum: 0.000000
2023-10-18 01:26:21,477 epoch 8 - iter 3647/5212 - loss 0.03990034 - time (sec): 295.69 - samples/sec: 871.35 - lr: 0.000013 - momentum: 0.000000
2023-10-18 01:27:07,438 epoch 8 - iter 4168/5212 - loss 0.03988533 - time (sec): 341.65 - samples/sec: 864.12 - lr: 0.000012 - momentum: 0.000000
2023-10-18 01:27:50,105 epoch 8 - iter 4689/5212 - loss 0.03847361 - time (sec): 384.32 - samples/sec: 859.02 - lr: 0.000012 - momentum: 0.000000
2023-10-18 01:28:33,305 epoch 8 - iter 5210/5212 - loss 0.03727021 - time (sec): 427.52 - samples/sec: 858.96 - lr: 0.000011 - momentum: 0.000000
2023-10-18 01:28:33,460 ----------------------------------------------------------------------------------------------------
2023-10-18 01:28:33,460 EPOCH 8 done: loss 0.0373 - lr: 0.000011
2023-10-18 01:28:44,615 DEV : loss 0.4129716157913208 - f1-score (micro avg) 0.3669
2023-10-18 01:28:44,677 ----------------------------------------------------------------------------------------------------
2023-10-18 01:29:29,811 epoch 9 - iter 521/5212 - loss 0.01869215 - time (sec): 45.13 - samples/sec: 865.28 - lr: 0.000011 - momentum: 0.000000
2023-10-18 01:30:11,850 epoch 9 - iter 1042/5212 - loss 0.02072276 - time (sec): 87.17 - samples/sec: 861.21 - lr: 0.000010 - momentum: 0.000000
2023-10-18 01:30:55,037 epoch 9 - iter 1563/5212 - loss 0.02156431 - time (sec): 130.36 - samples/sec: 845.78 - lr: 0.000009 - momentum: 0.000000
2023-10-18 01:31:38,177 epoch 9 - iter 2084/5212 - loss 0.02107749 - time (sec): 173.50 - samples/sec: 833.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 01:32:20,284 epoch 9 - iter 2605/5212 - loss 0.02033570 - time (sec): 215.60 - samples/sec: 833.24 - lr: 0.000008 - momentum: 0.000000
2023-10-18 01:33:00,969 epoch 9 - iter 3126/5212 - loss 0.02020637 - time (sec): 256.29 - samples/sec: 847.42 - lr: 0.000008 - momentum: 0.000000
2023-10-18 01:33:41,854 epoch 9 - iter 3647/5212 - loss 0.02018397 - time (sec): 297.17 - samples/sec: 856.84 - lr: 0.000007 - momentum: 0.000000
2023-10-18 01:34:24,795 epoch 9 - iter 4168/5212 - loss 0.01967850 - time (sec): 340.12 - samples/sec: 863.34 - lr: 0.000007 - momentum: 0.000000
2023-10-18 01:35:06,699 epoch 9 - iter 4689/5212 - loss 0.01954663 - time (sec): 382.02 - samples/sec: 864.64 - lr: 0.000006 - momentum: 0.000000
2023-10-18 01:35:47,910 epoch 9 - iter 5210/5212 - loss 0.01998115 - time (sec): 423.23 - samples/sec: 867.76 - lr: 0.000006 - momentum: 0.000000
2023-10-18 01:35:48,061 ----------------------------------------------------------------------------------------------------
2023-10-18 01:35:48,061 EPOCH 9 done: loss 0.0200 - lr: 0.000006
2023-10-18 01:36:00,418 DEV : loss 0.4562455713748932 - f1-score (micro avg) 0.3877
2023-10-18 01:36:00,486 ----------------------------------------------------------------------------------------------------
2023-10-18 01:36:44,282 epoch 10 - iter 521/5212 - loss 0.01029065 - time (sec): 43.79 - samples/sec: 856.77 - lr: 0.000005 - momentum: 0.000000
2023-10-18 01:37:26,680 epoch 10 - iter 1042/5212 - loss 0.01334978 - time (sec): 86.19 - samples/sec: 866.79 - lr: 0.000004 - momentum: 0.000000
2023-10-18 01:38:06,606 epoch 10 - iter 1563/5212 - loss 0.01386083 - time (sec): 126.12 - samples/sec: 869.64 - lr: 0.000004 - momentum: 0.000000
2023-10-18 01:38:46,290 epoch 10 - iter 2084/5212 - loss 0.01335369 - time (sec): 165.80 - samples/sec: 867.67 - lr: 0.000003 - momentum: 0.000000
2023-10-18 01:39:27,175 epoch 10 - iter 2605/5212 - loss 0.01357185 - time (sec): 206.69 - samples/sec: 888.86 - lr: 0.000003 - momentum: 0.000000
2023-10-18 01:40:07,806 epoch 10 - iter 3126/5212 - loss 0.01324602 - time (sec): 247.32 - samples/sec: 893.64 - lr: 0.000002 - momentum: 0.000000
2023-10-18 01:40:48,240 epoch 10 - iter 3647/5212 - loss 0.01325955 - time (sec): 287.75 - samples/sec: 904.07 - lr: 0.000002 - momentum: 0.000000
2023-10-18 01:41:28,545 epoch 10 - iter 4168/5212 - loss 0.01305894 - time (sec): 328.06 - samples/sec: 901.55 - lr: 0.000001 - momentum: 0.000000
2023-10-18 01:42:08,664 epoch 10 - iter 4689/5212 - loss 0.01288314 - time (sec): 368.18 - samples/sec: 897.88 - lr: 0.000001 - momentum: 0.000000
2023-10-18 01:42:48,729 epoch 10 - iter 5210/5212 - loss 0.01314818 - time (sec): 408.24 - samples/sec: 899.94 - lr: 0.000000 - momentum: 0.000000
2023-10-18 01:42:48,862 ----------------------------------------------------------------------------------------------------
2023-10-18 01:42:48,862 EPOCH 10 done: loss 0.0131 - lr: 0.000000
2023-10-18 01:43:00,991 DEV : loss 0.47269320487976074 - f1-score (micro avg) 0.3881
2023-10-18 01:43:01,627 ----------------------------------------------------------------------------------------------------
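Best-model selection is driven by the dev micro-F1 logged after each epoch; collecting those ten `DEV` scores shows epoch 3 (0.3894) was the last improvement, which is why "saving best model" stops appearing after epoch 3. A quick check, with the scores copied from the DEV lines above:

```python
# Dev micro-F1 per epoch, copied from the "DEV : ... f1-score (micro avg)" lines
dev_f1 = [0.3046, 0.3482, 0.3894, 0.3657, 0.3742,
          0.3549, 0.3587, 0.3669, 0.3877, 0.3881]  # epochs 1-10

best_epoch = max(range(len(dev_f1)), key=dev_f1.__getitem__) + 1
print(best_epoch, dev_f1[best_epoch - 1])  # epoch 3 is the best checkpoint
```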
2023-10-18 01:43:01,629 Loading model from best epoch ...
2023-10-18 01:43:04,487 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
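The 17-tag dictionary above is the BIOES encoding of the four entity types (LOC, PER, ORG, HumanProd): one `O` tag plus `S-`/`B-`/`E-`/`I-` variants per type, i.e. 4 × 4 + 1 = 17. A one-liner reproducing the tag set:

```python
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
prefixes = ["S", "B", "E", "I"]  # single-token, begin, end, inside

tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]
print(len(tags))  # 17, matching the SequenceTagger dictionary above
```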
2023-10-18 01:43:23,620
Results:
- F-score (micro) 0.3373
- F-score (macro) 0.219
- Accuracy 0.2047
By class:
              precision    recall  f1-score   support

         LOC     0.4867    0.3624    0.4155      1214
         PER     0.3677    0.2426    0.2923       808
         ORG     0.2075    0.1416    0.1684       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4088    0.2870    0.3373      2390
   macro avg     0.2655    0.1867    0.2190      2390
weighted avg     0.4022    0.2870    0.3347      2390
2023-10-18 01:43:23,620 ----------------------------------------------------------------------------------------------------
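The summary scores can be re-derived from the per-class table: micro F1 is the harmonic mean of the micro-averaged precision and recall, macro F1 is the unweighted mean of the class F1s, and the weighted average weights each class F1 by its support. A quick cross-check against the numbers above (inputs are rounded, so results agree to ~3 decimals):

```python
# (precision, recall, f1, support) per class, copied from the table above
report = {
    "LOC":       (0.4867, 0.3624, 0.4155, 1214),
    "PER":       (0.3677, 0.2426, 0.2923,  808),
    "ORG":       (0.2075, 0.1416, 0.1684,  353),
    "HumanProd": (0.0000, 0.0000, 0.0000,   15),
}

micro_p, micro_r = 0.4088, 0.2870  # micro-averaged precision/recall from the table
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

total_support = sum(n for _, _, _, n in report.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in report.values()) / total_support

print(micro_f1, macro_f1, weighted_f1)
# close to the 0.3373 / 0.2190 / 0.3347 summary rows above
```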