|
2023-10-25 20:55:34,863 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,864 Model: "SequenceTagger( |
|
  (embeddings): TransformerWordEmbeddings( |
|
    (model): BertModel( |
|
      (embeddings): BertEmbeddings( |
|
        (word_embeddings): Embedding(64001, 768) |
|
        (position_embeddings): Embedding(512, 768) |
|
        (token_type_embeddings): Embedding(2, 768) |
|
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
        (dropout): Dropout(p=0.1, inplace=False) |
|
      ) |
|
      (encoder): BertEncoder( |
|
        (layer): ModuleList( |
|
          (0-11): 12 x BertLayer( |
|
            (attention): BertAttention( |
|
              (self): BertSelfAttention( |
|
                (query): Linear(in_features=768, out_features=768, bias=True) |
|
                (key): Linear(in_features=768, out_features=768, bias=True) |
|
                (value): Linear(in_features=768, out_features=768, bias=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
              (output): BertSelfOutput( |
|
                (dense): Linear(in_features=768, out_features=768, bias=True) |
|
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
            ) |
|
            (intermediate): BertIntermediate( |
|
              (dense): Linear(in_features=768, out_features=3072, bias=True) |
|
              (intermediate_act_fn): GELUActivation() |
|
            ) |
|
            (output): BertOutput( |
|
              (dense): Linear(in_features=3072, out_features=768, bias=True) |
|
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
              (dropout): Dropout(p=0.1, inplace=False) |
|
            ) |
|
          ) |
|
        ) |
|
      ) |
|
      (pooler): BertPooler( |
|
        (dense): Linear(in_features=768, out_features=768, bias=True) |
|
        (activation): Tanh() |
|
      ) |
|
    ) |
|
  ) |
|
  (locked_dropout): LockedDropout(p=0.5) |
|
  (linear): Linear(in_features=768, out_features=17, bias=True) |
|
  (loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-25 20:55:34,864 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,864 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-25 20:55:34,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,865 Train: 1166 sentences |
|
2023-10-25 20:55:34,865 (train_with_dev=False, train_with_test=False) |
|
2023-10-25 20:55:34,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,865 Training Params: |
|
2023-10-25 20:55:34,865 - learning_rate: "3e-05" |
|
2023-10-25 20:55:34,865 - mini_batch_size: "8" |
|
2023-10-25 20:55:34,865 - max_epochs: "10" |
|
2023-10-25 20:55:34,865 - shuffle: "True" |
|
2023-10-25 20:55:34,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,865 Plugins: |
|
2023-10-25 20:55:34,865 - TensorboardLogger |
|
2023-10-25 20:55:34,865 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-25 20:55:34,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,865 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-25 20:55:34,865 - metric: "('micro avg', 'f1-score')" |
|
2023-10-25 20:55:34,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,865 Computation: |
|
2023-10-25 20:55:34,865 - compute on device: cuda:0 |
|
2023-10-25 20:55:34,865 - embedding storage: none |
|
2023-10-25 20:55:34,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,865 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-25 20:55:34,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:34,865 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-25 20:55:35,640 epoch 1 - iter 14/146 - loss 3.10902494 - time (sec): 0.77 - samples/sec: 4773.34 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 20:55:36,618 epoch 1 - iter 28/146 - loss 2.59619179 - time (sec): 1.75 - samples/sec: 4772.93 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 20:55:37,774 epoch 1 - iter 42/146 - loss 2.01643460 - time (sec): 2.91 - samples/sec: 4711.02 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 20:55:38,710 epoch 1 - iter 56/146 - loss 1.68487562 - time (sec): 3.84 - samples/sec: 4710.19 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 20:55:39,576 epoch 1 - iter 70/146 - loss 1.45782080 - time (sec): 4.71 - samples/sec: 4746.43 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:55:40,512 epoch 1 - iter 84/146 - loss 1.28264672 - time (sec): 5.65 - samples/sec: 4777.01 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 20:55:41,335 epoch 1 - iter 98/146 - loss 1.17339039 - time (sec): 6.47 - samples/sec: 4740.81 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 20:55:42,097 epoch 1 - iter 112/146 - loss 1.09059636 - time (sec): 7.23 - samples/sec: 4726.11 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 20:55:42,956 epoch 1 - iter 126/146 - loss 1.01664625 - time (sec): 8.09 - samples/sec: 4687.19 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 20:55:43,836 epoch 1 - iter 140/146 - loss 0.94414109 - time (sec): 8.97 - samples/sec: 4650.42 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 20:55:44,354 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:44,354 EPOCH 1 done: loss 0.9110 - lr: 0.000029 |
|
2023-10-25 20:55:44,864 DEV : loss 0.17841865122318268 - f1-score (micro avg) 0.5106 |
|
2023-10-25 20:55:44,868 saving best model |
|
2023-10-25 20:55:45,342 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:46,236 epoch 2 - iter 14/146 - loss 0.22197690 - time (sec): 0.89 - samples/sec: 4573.33 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 20:55:47,071 epoch 2 - iter 28/146 - loss 0.22650138 - time (sec): 1.73 - samples/sec: 4520.68 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 20:55:47,977 epoch 2 - iter 42/146 - loss 0.20323981 - time (sec): 2.63 - samples/sec: 4543.88 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 20:55:48,943 epoch 2 - iter 56/146 - loss 0.19757898 - time (sec): 3.60 - samples/sec: 4576.31 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 20:55:49,861 epoch 2 - iter 70/146 - loss 0.18298851 - time (sec): 4.52 - samples/sec: 4578.63 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 20:55:50,682 epoch 2 - iter 84/146 - loss 0.19089007 - time (sec): 5.34 - samples/sec: 4558.37 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 20:55:51,555 epoch 2 - iter 98/146 - loss 0.19516507 - time (sec): 6.21 - samples/sec: 4552.38 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 20:55:52,432 epoch 2 - iter 112/146 - loss 0.20019871 - time (sec): 7.09 - samples/sec: 4543.80 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 20:55:53,462 epoch 2 - iter 126/146 - loss 0.19843670 - time (sec): 8.12 - samples/sec: 4621.36 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 20:55:54,338 epoch 2 - iter 140/146 - loss 0.18651570 - time (sec): 9.00 - samples/sec: 4727.20 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 20:55:54,720 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:54,720 EPOCH 2 done: loss 0.1838 - lr: 0.000027 |
|
2023-10-25 20:55:55,793 DEV : loss 0.12096831947565079 - f1-score (micro avg) 0.6513 |
|
2023-10-25 20:55:55,797 saving best model |
|
2023-10-25 20:55:56,416 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:55:57,396 epoch 3 - iter 14/146 - loss 0.15771303 - time (sec): 0.98 - samples/sec: 4830.09 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 20:55:58,356 epoch 3 - iter 28/146 - loss 0.12392787 - time (sec): 1.94 - samples/sec: 4813.12 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 20:55:59,297 epoch 3 - iter 42/146 - loss 0.11174698 - time (sec): 2.88 - samples/sec: 4710.38 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 20:56:00,178 epoch 3 - iter 56/146 - loss 0.10673801 - time (sec): 3.76 - samples/sec: 4633.16 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 20:56:00,983 epoch 3 - iter 70/146 - loss 0.10484329 - time (sec): 4.57 - samples/sec: 4638.16 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 20:56:01,756 epoch 3 - iter 84/146 - loss 0.10244290 - time (sec): 5.34 - samples/sec: 4651.57 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 20:56:02,607 epoch 3 - iter 98/146 - loss 0.09985140 - time (sec): 6.19 - samples/sec: 4726.40 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:56:03,479 epoch 3 - iter 112/146 - loss 0.09841682 - time (sec): 7.06 - samples/sec: 4758.33 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:56:04,428 epoch 3 - iter 126/146 - loss 0.10081405 - time (sec): 8.01 - samples/sec: 4808.81 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:56:05,404 epoch 3 - iter 140/146 - loss 0.09934642 - time (sec): 8.99 - samples/sec: 4726.02 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:56:05,839 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:05,839 EPOCH 3 done: loss 0.1008 - lr: 0.000024 |
|
2023-10-25 20:56:06,762 DEV : loss 0.09357891231775284 - f1-score (micro avg) 0.7661 |
|
2023-10-25 20:56:06,767 saving best model |
|
2023-10-25 20:56:07,381 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:08,332 epoch 4 - iter 14/146 - loss 0.07334869 - time (sec): 0.95 - samples/sec: 4347.91 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 20:56:09,201 epoch 4 - iter 28/146 - loss 0.05889127 - time (sec): 1.82 - samples/sec: 4791.71 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 20:56:10,065 epoch 4 - iter 42/146 - loss 0.05796772 - time (sec): 2.68 - samples/sec: 4682.98 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 20:56:10,954 epoch 4 - iter 56/146 - loss 0.05708129 - time (sec): 3.57 - samples/sec: 4669.10 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 20:56:11,952 epoch 4 - iter 70/146 - loss 0.05918055 - time (sec): 4.57 - samples/sec: 4664.61 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 20:56:12,839 epoch 4 - iter 84/146 - loss 0.05873412 - time (sec): 5.46 - samples/sec: 4629.08 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 20:56:13,643 epoch 4 - iter 98/146 - loss 0.06356687 - time (sec): 6.26 - samples/sec: 4695.36 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 20:56:14,687 epoch 4 - iter 112/146 - loss 0.06533479 - time (sec): 7.30 - samples/sec: 4693.91 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 20:56:15,509 epoch 4 - iter 126/146 - loss 0.06305942 - time (sec): 8.13 - samples/sec: 4734.69 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 20:56:16,412 epoch 4 - iter 140/146 - loss 0.06429593 - time (sec): 9.03 - samples/sec: 4722.19 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 20:56:16,757 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:16,757 EPOCH 4 done: loss 0.0629 - lr: 0.000020 |
|
2023-10-25 20:56:17,680 DEV : loss 0.0992453321814537 - f1-score (micro avg) 0.7659 |
|
2023-10-25 20:56:17,685 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:18,460 epoch 5 - iter 14/146 - loss 0.02787161 - time (sec): 0.77 - samples/sec: 4584.62 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 20:56:19,460 epoch 5 - iter 28/146 - loss 0.03280871 - time (sec): 1.77 - samples/sec: 4512.17 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:56:20,354 epoch 5 - iter 42/146 - loss 0.03579868 - time (sec): 2.67 - samples/sec: 4711.75 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:56:21,268 epoch 5 - iter 56/146 - loss 0.03587840 - time (sec): 3.58 - samples/sec: 4791.26 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:56:22,054 epoch 5 - iter 70/146 - loss 0.03779275 - time (sec): 4.37 - samples/sec: 4749.61 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 20:56:23,054 epoch 5 - iter 84/146 - loss 0.04022006 - time (sec): 5.37 - samples/sec: 4673.40 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 20:56:24,062 epoch 5 - iter 98/146 - loss 0.04334585 - time (sec): 6.38 - samples/sec: 4688.25 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 20:56:25,023 epoch 5 - iter 112/146 - loss 0.04098482 - time (sec): 7.34 - samples/sec: 4679.22 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 20:56:25,909 epoch 5 - iter 126/146 - loss 0.04027951 - time (sec): 8.22 - samples/sec: 4689.20 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 20:56:26,808 epoch 5 - iter 140/146 - loss 0.03867150 - time (sec): 9.12 - samples/sec: 4720.78 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 20:56:27,140 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:27,140 EPOCH 5 done: loss 0.0396 - lr: 0.000017 |
|
2023-10-25 20:56:28,210 DEV : loss 0.11319706588983536 - f1-score (micro avg) 0.7379 |
|
2023-10-25 20:56:28,215 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:29,241 epoch 6 - iter 14/146 - loss 0.02816455 - time (sec): 1.03 - samples/sec: 5084.08 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 20:56:30,116 epoch 6 - iter 28/146 - loss 0.02905426 - time (sec): 1.90 - samples/sec: 4841.16 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 20:56:30,954 epoch 6 - iter 42/146 - loss 0.02846874 - time (sec): 2.74 - samples/sec: 4913.66 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 20:56:31,956 epoch 6 - iter 56/146 - loss 0.02944112 - time (sec): 3.74 - samples/sec: 4876.40 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 20:56:32,914 epoch 6 - iter 70/146 - loss 0.03328946 - time (sec): 4.70 - samples/sec: 4721.71 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 20:56:33,910 epoch 6 - iter 84/146 - loss 0.03096724 - time (sec): 5.69 - samples/sec: 4699.93 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 20:56:34,728 epoch 6 - iter 98/146 - loss 0.03029441 - time (sec): 6.51 - samples/sec: 4720.08 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 20:56:35,601 epoch 6 - iter 112/146 - loss 0.02919295 - time (sec): 7.39 - samples/sec: 4735.33 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:56:36,503 epoch 6 - iter 126/146 - loss 0.02985431 - time (sec): 8.29 - samples/sec: 4774.60 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:56:37,277 epoch 6 - iter 140/146 - loss 0.02917088 - time (sec): 9.06 - samples/sec: 4732.70 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:56:37,630 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:37,630 EPOCH 6 done: loss 0.0287 - lr: 0.000014 |
|
2023-10-25 20:56:38,547 DEV : loss 0.11916946619749069 - f1-score (micro avg) 0.7636 |
|
2023-10-25 20:56:38,552 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:39,434 epoch 7 - iter 14/146 - loss 0.03408250 - time (sec): 0.88 - samples/sec: 4638.09 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 20:56:40,274 epoch 7 - iter 28/146 - loss 0.02376972 - time (sec): 1.72 - samples/sec: 4866.26 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 20:56:41,435 epoch 7 - iter 42/146 - loss 0.02225046 - time (sec): 2.88 - samples/sec: 4656.63 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 20:56:42,359 epoch 7 - iter 56/146 - loss 0.02004467 - time (sec): 3.81 - samples/sec: 4646.00 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 20:56:43,163 epoch 7 - iter 70/146 - loss 0.01988317 - time (sec): 4.61 - samples/sec: 4728.26 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 20:56:44,111 epoch 7 - iter 84/146 - loss 0.01935512 - time (sec): 5.56 - samples/sec: 4736.65 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 20:56:44,903 epoch 7 - iter 98/146 - loss 0.01988505 - time (sec): 6.35 - samples/sec: 4774.12 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 20:56:45,711 epoch 7 - iter 112/146 - loss 0.01989052 - time (sec): 7.16 - samples/sec: 4781.19 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 20:56:46,501 epoch 7 - iter 126/146 - loss 0.02080479 - time (sec): 7.95 - samples/sec: 4797.20 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 20:56:47,354 epoch 7 - iter 140/146 - loss 0.02136461 - time (sec): 8.80 - samples/sec: 4867.84 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 20:56:47,687 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:47,687 EPOCH 7 done: loss 0.0210 - lr: 0.000010 |
|
2023-10-25 20:56:48,605 DEV : loss 0.1305539757013321 - f1-score (micro avg) 0.7559 |
|
2023-10-25 20:56:48,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:49,483 epoch 8 - iter 14/146 - loss 0.01066087 - time (sec): 0.87 - samples/sec: 4987.56 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 20:56:50,259 epoch 8 - iter 28/146 - loss 0.01924431 - time (sec): 1.65 - samples/sec: 5148.60 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:56:51,083 epoch 8 - iter 42/146 - loss 0.01917116 - time (sec): 2.47 - samples/sec: 4892.83 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:56:51,946 epoch 8 - iter 56/146 - loss 0.01664941 - time (sec): 3.33 - samples/sec: 4811.37 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:56:52,766 epoch 8 - iter 70/146 - loss 0.01525997 - time (sec): 4.15 - samples/sec: 4948.62 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:56:53,655 epoch 8 - iter 84/146 - loss 0.01445365 - time (sec): 5.04 - samples/sec: 4984.69 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 20:56:54,523 epoch 8 - iter 98/146 - loss 0.01414525 - time (sec): 5.91 - samples/sec: 4985.90 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 20:56:55,513 epoch 8 - iter 112/146 - loss 0.01389800 - time (sec): 6.90 - samples/sec: 4901.11 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 20:56:56,565 epoch 8 - iter 126/146 - loss 0.01312728 - time (sec): 7.95 - samples/sec: 4900.13 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 20:56:57,365 epoch 8 - iter 140/146 - loss 0.01274159 - time (sec): 8.75 - samples/sec: 4885.71 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 20:56:57,779 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:57,780 EPOCH 8 done: loss 0.0149 - lr: 0.000007 |
|
2023-10-25 20:56:58,699 DEV : loss 0.14879070222377777 - f1-score (micro avg) 0.7646 |
|
2023-10-25 20:56:58,704 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:56:59,523 epoch 9 - iter 14/146 - loss 0.00714639 - time (sec): 0.82 - samples/sec: 4829.29 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 20:57:00,417 epoch 9 - iter 28/146 - loss 0.01070620 - time (sec): 1.71 - samples/sec: 4815.29 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 20:57:01,393 epoch 9 - iter 42/146 - loss 0.01020899 - time (sec): 2.69 - samples/sec: 4600.06 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 20:57:02,262 epoch 9 - iter 56/146 - loss 0.01078880 - time (sec): 3.56 - samples/sec: 4544.91 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 20:57:03,140 epoch 9 - iter 70/146 - loss 0.00974936 - time (sec): 4.43 - samples/sec: 4582.25 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 20:57:04,103 epoch 9 - iter 84/146 - loss 0.00881240 - time (sec): 5.40 - samples/sec: 4663.52 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 20:57:05,042 epoch 9 - iter 98/146 - loss 0.00916716 - time (sec): 6.34 - samples/sec: 4686.47 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 20:57:05,964 epoch 9 - iter 112/146 - loss 0.01197895 - time (sec): 7.26 - samples/sec: 4663.61 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:57:06,949 epoch 9 - iter 126/146 - loss 0.01172101 - time (sec): 8.24 - samples/sec: 4651.19 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:57:07,864 epoch 9 - iter 140/146 - loss 0.01212810 - time (sec): 9.16 - samples/sec: 4660.42 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:57:08,219 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:08,219 EPOCH 9 done: loss 0.0121 - lr: 0.000004 |
|
2023-10-25 20:57:09,147 DEV : loss 0.1434531956911087 - f1-score (micro avg) 0.7716 |
|
2023-10-25 20:57:09,152 saving best model |
|
2023-10-25 20:57:09,638 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:10,468 epoch 10 - iter 14/146 - loss 0.00731567 - time (sec): 0.83 - samples/sec: 4582.21 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 20:57:11,271 epoch 10 - iter 28/146 - loss 0.00696944 - time (sec): 1.63 - samples/sec: 4587.30 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 20:57:12,142 epoch 10 - iter 42/146 - loss 0.00664134 - time (sec): 2.50 - samples/sec: 4552.34 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 20:57:13,223 epoch 10 - iter 56/146 - loss 0.00954688 - time (sec): 3.58 - samples/sec: 4722.65 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 20:57:14,214 epoch 10 - iter 70/146 - loss 0.00875735 - time (sec): 4.57 - samples/sec: 4665.89 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 20:57:15,226 epoch 10 - iter 84/146 - loss 0.00886044 - time (sec): 5.59 - samples/sec: 4755.06 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 20:57:16,179 epoch 10 - iter 98/146 - loss 0.00861602 - time (sec): 6.54 - samples/sec: 4786.47 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 20:57:16,985 epoch 10 - iter 112/146 - loss 0.00925849 - time (sec): 7.34 - samples/sec: 4765.56 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 20:57:17,802 epoch 10 - iter 126/146 - loss 0.00860254 - time (sec): 8.16 - samples/sec: 4739.79 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 20:57:18,595 epoch 10 - iter 140/146 - loss 0.00889062 - time (sec): 8.95 - samples/sec: 4786.41 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 20:57:18,928 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:18,928 EPOCH 10 done: loss 0.0089 - lr: 0.000000 |
|
2023-10-25 20:57:19,852 DEV : loss 0.14739026129245758 - f1-score (micro avg) 0.7732 |
|
2023-10-25 20:57:19,857 saving best model |
|
2023-10-25 20:57:21,031 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:21,032 Loading model from best epoch ... |
|
2023-10-25 20:57:22,663 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-25 20:57:24,209 |
|
Results: |
|
- F-score (micro) 0.7561 |
|
- F-score (macro) 0.6892 |
|
- Accuracy 0.634 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         PER     0.7855    0.8420    0.8128       348 |
|
         LOC     0.6817    0.8123    0.7413       261 |
|
         ORG     0.4375    0.4038    0.4200        52 |
|
   HumanProd     0.7500    0.8182    0.7826        22 |
|

|
   micro avg     0.7196    0.7965    0.7561       683 |
|
   macro avg     0.6637    0.7191    0.6892       683 |
|
weighted avg     0.7182    0.7965    0.7546       683 |
|
|
|
2023-10-25 20:57:24,210 ---------------------------------------------------------------------------------------------------- |
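The averaged rows of the "By class" table can be cross-checked from the per-class rows. The sketch below is not part of the training run: it copies the figures from the log and recomputes the macro, weighted, and micro F1 values. The variable names are illustrative, and the micro F1 is derived from the reported micro precision and recall rather than from raw counts, which the log does not show.

```python
# Per-class rows copied from the "By class" table:
# (precision, recall, f1-score, support)
per_class = {
    "PER": (0.7855, 0.8420, 0.8128, 348),
    "LOC": (0.6817, 0.8123, 0.7413, 261),
    "ORG": (0.4375, 0.4038, 0.4200, 52),
    "HumanProd": (0.7500, 0.8182, 0.7826, 22),
}

total_support = sum(row[3] for row in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by class support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1: harmonic mean of the reported micro precision and recall.
micro_p, micro_r = 0.7196, 0.7965
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"support      {total_support}")
print(f"macro F1     {macro_f1:.4f}")
print(f"weighted F1  {weighted_f1:.4f}")
print(f"micro F1     {micro_f1:.4f}")
```

The gap between the micro and macro scores comes mostly from ORG, which has low F1 but only 52 of the 683 test entities, so it pulls the macro average down much more than the micro average.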
|
|
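The `lr` column in the iteration lines is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero, as configured by `LinearScheduler | warmup_fraction: '0.1'`. The following is a minimal sketch of such a schedule, not Flair's own code. It assumes 146 batches per epoch times 10 epochs (1460 total steps), and that the logged value belongs to the zero-based step of the batch just processed; both are inferred from the log. `linear_lr` and `lr_at` are hypothetical helper names. Because the exact step accounting inside Flair is not visible here, the last printed digit can differ from the log by one.

```python
PEAK_LR = 3e-5           # learning_rate from "Training Params"
WARMUP_FRACTION = 0.1    # warmup_fraction of the LinearScheduler plugin
BATCHES_PER_EPOCH = 146  # from "iter 14/146"
MAX_EPOCHS = 10          # max_epochs from "Training Params"

# Assumption: the schedule spans every batch of every epoch.
TOTAL_STEPS = BATCHES_PER_EPOCH * MAX_EPOCHS
WARMUP_STEPS = int(TOTAL_STEPS * WARMUP_FRACTION)


def linear_lr(step: int) -> float:
    """Learning rate at a zero-based global step (warmup, then linear decay)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


def lr_at(epoch: int, iteration: int) -> float:
    """Map a 1-based (epoch, iter) pair from the log to a schedule value."""
    step = (epoch - 1) * BATCHES_PER_EPOCH + (iteration - 1)
    return linear_lr(step)


# Print a few points in the same format as the log's lr column.
for epoch, iteration in [(1, 14), (1, 140), (2, 14), (5, 14), (10, 140)]:
    print(f"epoch {epoch} - iter {iteration}/146 - lr: {lr_at(epoch, iteration):.6f}")
```

Under these assumptions the peak is reached at the end of epoch 1, which matches the log: the rate climbs to about 0.000029 by iter 140 of epoch 1 and falls from 0.000030 at the start of epoch 2 to 0.000000 at the end of epoch 10.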