2023-10-25 15:07:16,252 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,253 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:07:16,253 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Train: 7142 sentences
2023-10-25 15:07:16,254 (train_with_dev=False, train_with_test=False)
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Training Params:
2023-10-25 15:07:16,254 - learning_rate: "3e-05"
2023-10-25 15:07:16,254 - mini_batch_size: "8"
2023-10-25 15:07:16,254 - max_epochs: "10"
2023-10-25 15:07:16,254 - shuffle: "True"
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Plugins:
2023-10-25 15:07:16,254 - TensorboardLogger
2023-10-25 15:07:16,254 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
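For reference, the lr column in the epoch logs below follows the LinearScheduler shape with warmup_fraction 0.1: linear warmup over the first 10% of the 8,930 total steps (10 epochs x 893 batches) up to the peak learning rate 3e-05, then linear decay to zero. A minimal sketch of that shape (the function name and closed form are illustrative, not Flair's implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: lr grows linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: lr falls linearly from peak_lr to 0 at the last step.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 10 epochs x 893 batches = 8930 steps, peak lr 3e-05 (values from this log).
print(linear_schedule_lr(893, 8930, 3e-05))   # peak, reached at end of epoch 1
print(linear_schedule_lr(8930, 8930, 3e-05))  # fully decayed at the last step
```

With these numbers the schedule reproduces the logged values, e.g. roughly 0.000003 at iter 89 of epoch 1 and roughly 0.000028 mid-epoch 2.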
2023-10-25 15:07:16,254 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:07:16,254 - metric: "('micro avg', 'f1-score')"
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Computation:
2023-10-25 15:07:16,254 - compute on device: cuda:0
2023-10-25 15:07:16,254 - embedding storage: none
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 15:07:16,255 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,255 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,255 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:07:22,287 epoch 1 - iter 89/893 - loss 2.32679756 - time (sec): 6.03 - samples/sec: 4228.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:07:27,984 epoch 1 - iter 178/893 - loss 1.51396462 - time (sec): 11.73 - samples/sec: 4163.39 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:07:33,714 epoch 1 - iter 267/893 - loss 1.14478279 - time (sec): 17.46 - samples/sec: 4138.90 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:07:39,758 epoch 1 - iter 356/893 - loss 0.92624548 - time (sec): 23.50 - samples/sec: 4118.92 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:07:45,647 epoch 1 - iter 445/893 - loss 0.77920214 - time (sec): 29.39 - samples/sec: 4149.48 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:07:51,499 epoch 1 - iter 534/893 - loss 0.67179663 - time (sec): 35.24 - samples/sec: 4208.38 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:07:57,019 epoch 1 - iter 623/893 - loss 0.60130729 - time (sec): 40.76 - samples/sec: 4247.79 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:08:02,490 epoch 1 - iter 712/893 - loss 0.54657076 - time (sec): 46.23 - samples/sec: 4282.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:08:07,955 epoch 1 - iter 801/893 - loss 0.50230078 - time (sec): 51.70 - samples/sec: 4306.93 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:08:13,549 epoch 1 - iter 890/893 - loss 0.46653814 - time (sec): 57.29 - samples/sec: 4330.30 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:08:13,712 ----------------------------------------------------------------------------------------------------
2023-10-25 15:08:13,712 EPOCH 1 done: loss 0.4656 - lr: 0.000030
2023-10-25 15:08:17,345 DEV : loss 0.1060444563627243 - f1-score (micro avg) 0.7387
2023-10-25 15:08:17,369 saving best model
2023-10-25 15:08:17,905 ----------------------------------------------------------------------------------------------------
2023-10-25 15:08:23,847 epoch 2 - iter 89/893 - loss 0.11786204 - time (sec): 5.94 - samples/sec: 4321.04 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:08:29,333 epoch 2 - iter 178/893 - loss 0.11830957 - time (sec): 11.43 - samples/sec: 4126.75 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:35,534 epoch 2 - iter 267/893 - loss 0.11117703 - time (sec): 17.63 - samples/sec: 4187.28 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:41,384 epoch 2 - iter 356/893 - loss 0.11088492 - time (sec): 23.48 - samples/sec: 4194.66 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:47,307 epoch 2 - iter 445/893 - loss 0.10753992 - time (sec): 29.40 - samples/sec: 4218.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:08:53,058 epoch 2 - iter 534/893 - loss 0.10789428 - time (sec): 35.15 - samples/sec: 4232.20 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:08:58,673 epoch 2 - iter 623/893 - loss 0.10558977 - time (sec): 40.77 - samples/sec: 4289.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:09:04,088 epoch 2 - iter 712/893 - loss 0.10375529 - time (sec): 46.18 - samples/sec: 4271.55 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:09,613 epoch 2 - iter 801/893 - loss 0.10372405 - time (sec): 51.71 - samples/sec: 4305.77 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:15,240 epoch 2 - iter 890/893 - loss 0.10324515 - time (sec): 57.33 - samples/sec: 4321.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:15,429 ----------------------------------------------------------------------------------------------------
2023-10-25 15:09:15,430 EPOCH 2 done: loss 0.1031 - lr: 0.000027
2023-10-25 15:09:20,245 DEV : loss 0.09593858569860458 - f1-score (micro avg) 0.777
2023-10-25 15:09:20,268 saving best model
2023-10-25 15:09:20,917 ----------------------------------------------------------------------------------------------------
2023-10-25 15:09:26,476 epoch 3 - iter 89/893 - loss 0.06107853 - time (sec): 5.56 - samples/sec: 4578.32 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:31,970 epoch 3 - iter 178/893 - loss 0.05655927 - time (sec): 11.05 - samples/sec: 4428.83 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:37,621 epoch 3 - iter 267/893 - loss 0.05765556 - time (sec): 16.70 - samples/sec: 4489.95 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:43,134 epoch 3 - iter 356/893 - loss 0.05899019 - time (sec): 22.22 - samples/sec: 4487.90 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:09:48,684 epoch 3 - iter 445/893 - loss 0.06059285 - time (sec): 27.77 - samples/sec: 4430.84 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:09:54,437 epoch 3 - iter 534/893 - loss 0.06203883 - time (sec): 33.52 - samples/sec: 4406.62 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:10:00,394 epoch 3 - iter 623/893 - loss 0.06228959 - time (sec): 39.48 - samples/sec: 4403.41 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:06,271 epoch 3 - iter 712/893 - loss 0.06223592 - time (sec): 45.35 - samples/sec: 4403.81 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:12,116 epoch 3 - iter 801/893 - loss 0.06176464 - time (sec): 51.20 - samples/sec: 4397.44 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:17,781 epoch 3 - iter 890/893 - loss 0.06140542 - time (sec): 56.86 - samples/sec: 4364.69 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:17,965 ----------------------------------------------------------------------------------------------------
2023-10-25 15:10:17,965 EPOCH 3 done: loss 0.0613 - lr: 0.000023
2023-10-25 15:10:22,849 DEV : loss 0.10392870754003525 - f1-score (micro avg) 0.7824
2023-10-25 15:10:22,870 saving best model
2023-10-25 15:10:23,572 ----------------------------------------------------------------------------------------------------
2023-10-25 15:10:29,407 epoch 4 - iter 89/893 - loss 0.04410301 - time (sec): 5.83 - samples/sec: 4278.04 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:35,183 epoch 4 - iter 178/893 - loss 0.04544728 - time (sec): 11.61 - samples/sec: 4301.00 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:40,777 epoch 4 - iter 267/893 - loss 0.04597693 - time (sec): 17.20 - samples/sec: 4268.80 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:46,358 epoch 4 - iter 356/893 - loss 0.04537082 - time (sec): 22.78 - samples/sec: 4353.99 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:52,237 epoch 4 - iter 445/893 - loss 0.04624815 - time (sec): 28.66 - samples/sec: 4335.26 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:58,294 epoch 4 - iter 534/893 - loss 0.04475990 - time (sec): 34.72 - samples/sec: 4348.02 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:04,125 epoch 4 - iter 623/893 - loss 0.04559760 - time (sec): 40.55 - samples/sec: 4316.02 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:09,950 epoch 4 - iter 712/893 - loss 0.04461195 - time (sec): 46.38 - samples/sec: 4271.43 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:15,961 epoch 4 - iter 801/893 - loss 0.04371497 - time (sec): 52.39 - samples/sec: 4285.47 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:21,628 epoch 4 - iter 890/893 - loss 0.04355757 - time (sec): 58.05 - samples/sec: 4275.23 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:21,806 ----------------------------------------------------------------------------------------------------
2023-10-25 15:11:21,807 EPOCH 4 done: loss 0.0437 - lr: 0.000020
2023-10-25 15:11:25,871 DEV : loss 0.1405394971370697 - f1-score (micro avg) 0.7739
2023-10-25 15:11:25,895 ----------------------------------------------------------------------------------------------------
2023-10-25 15:11:31,806 epoch 5 - iter 89/893 - loss 0.03086483 - time (sec): 5.91 - samples/sec: 4018.56 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:37,763 epoch 5 - iter 178/893 - loss 0.03408901 - time (sec): 11.87 - samples/sec: 4169.34 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:43,934 epoch 5 - iter 267/893 - loss 0.03368610 - time (sec): 18.04 - samples/sec: 4168.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:49,742 epoch 5 - iter 356/893 - loss 0.03342150 - time (sec): 23.84 - samples/sec: 4171.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:55,551 epoch 5 - iter 445/893 - loss 0.03366136 - time (sec): 29.65 - samples/sec: 4169.40 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:01,573 epoch 5 - iter 534/893 - loss 0.03390282 - time (sec): 35.68 - samples/sec: 4202.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:07,368 epoch 5 - iter 623/893 - loss 0.03357706 - time (sec): 41.47 - samples/sec: 4197.80 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:13,096 epoch 5 - iter 712/893 - loss 0.03266215 - time (sec): 47.20 - samples/sec: 4199.37 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:18,874 epoch 5 - iter 801/893 - loss 0.03391376 - time (sec): 52.98 - samples/sec: 4207.00 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:24,683 epoch 5 - iter 890/893 - loss 0.03419362 - time (sec): 58.79 - samples/sec: 4219.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:24,885 ----------------------------------------------------------------------------------------------------
2023-10-25 15:12:24,885 EPOCH 5 done: loss 0.0341 - lr: 0.000017
2023-10-25 15:12:29,919 DEV : loss 0.16618064045906067 - f1-score (micro avg) 0.8051
2023-10-25 15:12:29,940 saving best model
2023-10-25 15:12:30,620 ----------------------------------------------------------------------------------------------------
2023-10-25 15:12:36,386 epoch 6 - iter 89/893 - loss 0.01934906 - time (sec): 5.76 - samples/sec: 4367.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:42,034 epoch 6 - iter 178/893 - loss 0.01971571 - time (sec): 11.41 - samples/sec: 4277.35 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:48,099 epoch 6 - iter 267/893 - loss 0.01891099 - time (sec): 17.48 - samples/sec: 4240.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:54,052 epoch 6 - iter 356/893 - loss 0.02463843 - time (sec): 23.43 - samples/sec: 4232.23 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:12:59,977 epoch 6 - iter 445/893 - loss 0.02485032 - time (sec): 29.35 - samples/sec: 4266.10 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:13:05,810 epoch 6 - iter 534/893 - loss 0.02602922 - time (sec): 35.19 - samples/sec: 4217.86 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:13:11,965 epoch 6 - iter 623/893 - loss 0.02550398 - time (sec): 41.34 - samples/sec: 4206.73 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:17,850 epoch 6 - iter 712/893 - loss 0.02490406 - time (sec): 47.23 - samples/sec: 4219.72 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:23,687 epoch 6 - iter 801/893 - loss 0.02516364 - time (sec): 53.06 - samples/sec: 4220.25 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:29,433 epoch 6 - iter 890/893 - loss 0.02571510 - time (sec): 58.81 - samples/sec: 4220.50 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:29,621 ----------------------------------------------------------------------------------------------------
2023-10-25 15:13:29,622 EPOCH 6 done: loss 0.0258 - lr: 0.000013
2023-10-25 15:13:34,396 DEV : loss 0.17371046543121338 - f1-score (micro avg) 0.8112
2023-10-25 15:13:34,417 saving best model
2023-10-25 15:13:35,072 ----------------------------------------------------------------------------------------------------
2023-10-25 15:13:41,193 epoch 7 - iter 89/893 - loss 0.02011470 - time (sec): 6.12 - samples/sec: 4355.79 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:47,009 epoch 7 - iter 178/893 - loss 0.02325248 - time (sec): 11.94 - samples/sec: 4245.96 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:52,906 epoch 7 - iter 267/893 - loss 0.02032808 - time (sec): 17.83 - samples/sec: 4206.61 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:13:58,744 epoch 7 - iter 356/893 - loss 0.02038788 - time (sec): 23.67 - samples/sec: 4235.40 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:14:04,629 epoch 7 - iter 445/893 - loss 0.01997941 - time (sec): 29.56 - samples/sec: 4217.35 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:14:10,590 epoch 7 - iter 534/893 - loss 0.01925592 - time (sec): 35.52 - samples/sec: 4185.73 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:16,316 epoch 7 - iter 623/893 - loss 0.01978486 - time (sec): 41.24 - samples/sec: 4162.69 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:22,572 epoch 7 - iter 712/893 - loss 0.01991371 - time (sec): 47.50 - samples/sec: 4168.20 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:28,265 epoch 7 - iter 801/893 - loss 0.01994353 - time (sec): 53.19 - samples/sec: 4181.97 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:34,108 epoch 7 - iter 890/893 - loss 0.02008156 - time (sec): 59.03 - samples/sec: 4202.26 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:34,306 ----------------------------------------------------------------------------------------------------
2023-10-25 15:14:34,306 EPOCH 7 done: loss 0.0200 - lr: 0.000010
2023-10-25 15:14:38,320 DEV : loss 0.17937816679477692 - f1-score (micro avg) 0.8123
2023-10-25 15:14:38,342 saving best model
2023-10-25 15:14:39,015 ----------------------------------------------------------------------------------------------------
2023-10-25 15:14:44,770 epoch 8 - iter 89/893 - loss 0.01881002 - time (sec): 5.75 - samples/sec: 4163.23 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:50,519 epoch 8 - iter 178/893 - loss 0.01785540 - time (sec): 11.50 - samples/sec: 4223.67 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:14:56,466 epoch 8 - iter 267/893 - loss 0.01663373 - time (sec): 17.45 - samples/sec: 4239.61 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:15:02,200 epoch 8 - iter 356/893 - loss 0.01641502 - time (sec): 23.18 - samples/sec: 4212.11 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:15:08,033 epoch 8 - iter 445/893 - loss 0.01638939 - time (sec): 29.02 - samples/sec: 4217.12 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:14,046 epoch 8 - iter 534/893 - loss 0.01532943 - time (sec): 35.03 - samples/sec: 4218.47 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:20,415 epoch 8 - iter 623/893 - loss 0.01502360 - time (sec): 41.40 - samples/sec: 4207.79 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:26,296 epoch 8 - iter 712/893 - loss 0.01549364 - time (sec): 47.28 - samples/sec: 4180.53 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:32,060 epoch 8 - iter 801/893 - loss 0.01600174 - time (sec): 53.04 - samples/sec: 4193.62 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:38,072 epoch 8 - iter 890/893 - loss 0.01593321 - time (sec): 59.06 - samples/sec: 4200.02 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:38,267 ----------------------------------------------------------------------------------------------------
2023-10-25 15:15:38,267 EPOCH 8 done: loss 0.0159 - lr: 0.000007
2023-10-25 15:15:43,263 DEV : loss 0.21227356791496277 - f1-score (micro avg) 0.7971
2023-10-25 15:15:43,284 ----------------------------------------------------------------------------------------------------
2023-10-25 15:15:49,118 epoch 9 - iter 89/893 - loss 0.00920994 - time (sec): 5.83 - samples/sec: 4231.87 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:15:54,872 epoch 9 - iter 178/893 - loss 0.00891932 - time (sec): 11.59 - samples/sec: 4226.85 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:16:00,508 epoch 9 - iter 267/893 - loss 0.01039436 - time (sec): 17.22 - samples/sec: 4278.15 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:16:06,782 epoch 9 - iter 356/893 - loss 0.01056567 - time (sec): 23.50 - samples/sec: 4283.75 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:12,664 epoch 9 - iter 445/893 - loss 0.01205154 - time (sec): 29.38 - samples/sec: 4283.65 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:18,609 epoch 9 - iter 534/893 - loss 0.01179973 - time (sec): 35.32 - samples/sec: 4285.62 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:24,456 epoch 9 - iter 623/893 - loss 0.01148151 - time (sec): 41.17 - samples/sec: 4253.07 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:30,157 epoch 9 - iter 712/893 - loss 0.01108116 - time (sec): 46.87 - samples/sec: 4279.34 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:35,747 epoch 9 - iter 801/893 - loss 0.01083303 - time (sec): 52.46 - samples/sec: 4280.96 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:41,190 epoch 9 - iter 890/893 - loss 0.01082633 - time (sec): 57.90 - samples/sec: 4284.21 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:16:41,365 ----------------------------------------------------------------------------------------------------
2023-10-25 15:16:41,366 EPOCH 9 done: loss 0.0108 - lr: 0.000003
2023-10-25 15:16:46,208 DEV : loss 0.21176157891750336 - f1-score (micro avg) 0.8104
2023-10-25 15:16:46,230 ----------------------------------------------------------------------------------------------------
2023-10-25 15:16:51,777 epoch 10 - iter 89/893 - loss 0.01317723 - time (sec): 5.55 - samples/sec: 4308.65 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:16:57,614 epoch 10 - iter 178/893 - loss 0.00887501 - time (sec): 11.38 - samples/sec: 4360.90 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:17:03,568 epoch 10 - iter 267/893 - loss 0.00690912 - time (sec): 17.34 - samples/sec: 4311.16 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:09,586 epoch 10 - iter 356/893 - loss 0.00707851 - time (sec): 23.35 - samples/sec: 4278.17 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:15,125 epoch 10 - iter 445/893 - loss 0.00657208 - time (sec): 28.89 - samples/sec: 4225.74 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:20,950 epoch 10 - iter 534/893 - loss 0.00656213 - time (sec): 34.72 - samples/sec: 4263.47 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:26,715 epoch 10 - iter 623/893 - loss 0.00748677 - time (sec): 40.48 - samples/sec: 4268.60 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:32,254 epoch 10 - iter 712/893 - loss 0.00756648 - time (sec): 46.02 - samples/sec: 4296.21 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:37,995 epoch 10 - iter 801/893 - loss 0.00771944 - time (sec): 51.76 - samples/sec: 4279.89 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:17:43,827 epoch 10 - iter 890/893 - loss 0.00812425 - time (sec): 57.60 - samples/sec: 4307.35 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:17:43,999 ----------------------------------------------------------------------------------------------------
2023-10-25 15:17:43,999 EPOCH 10 done: loss 0.0081 - lr: 0.000000
2023-10-25 15:17:47,978 DEV : loss 0.21720275282859802 - f1-score (micro avg) 0.8147
2023-10-25 15:17:47,999 saving best model
2023-10-25 15:17:49,119 ----------------------------------------------------------------------------------------------------
2023-10-25 15:17:49,120 Loading model from best epoch ...
2023-10-25 15:17:51,035 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
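The 17 tags above follow the BIOES scheme: each entity type (PER, LOC, ORG, HumanProd) gets Single, Begin, End, and Inside variants, plus the shared O tag. As a rough illustration of how such a tag sequence maps back to entity spans (a simplified sketch of standard BIOES semantics, not Flair's actual span decoder; malformed sequences are skipped rather than repaired):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (start, end, label) spans (sketch)."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None  # outside any entity; abandon an unfinished span
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":          # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":        # entity begins here
            start = i
        elif prefix == "E" and start is not None:  # entity ends here
            spans.append((start, i, label))
            start = None
        # "I" just continues the current span; nothing to record yet
    return spans

print(bioes_to_spans(["O", "S-PER", "B-LOC", "I-LOC", "E-LOC"]))
# [(1, 1, 'PER'), (2, 4, 'LOC')]
```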
2023-10-25 15:18:03,575
Results:
- F-score (micro) 0.6992
- F-score (macro) 0.6245
- Accuracy 0.5561

By class:
              precision    recall  f1-score   support

         LOC     0.7038    0.6986    0.7012      1095
         PER     0.7808    0.7816    0.7812      1012
         ORG     0.4549    0.5798    0.5099       357
   HumanProd     0.4074    0.6667    0.5057        33

   micro avg     0.6842    0.7149    0.6992      2497
   macro avg     0.5867    0.6817    0.6245      2497
weighted avg     0.6955    0.7149    0.7037      2497
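As a sanity check on the test results above: macro F1 is the unweighted mean of the per-class F1 scores, while micro F1 is the harmonic mean of the pooled precision and recall. Both reported averages can be reproduced from the table (values copied from this log):

```python
# Per-class F1 scores (LOC, PER, ORG, HumanProd) and pooled micro
# precision/recall, copied from the evaluation table in this log.
per_class_f1 = [0.7012, 0.7812, 0.5099, 0.5057]
micro_p, micro_r = 0.6842, 0.7149

macro_f1 = sum(per_class_f1) / len(per_class_f1)        # unweighted mean
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # harmonic mean

print(round(macro_f1, 4))  # 0.6245
print(round(micro_f1, 4))  # 0.6992
```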
2023-10-25 15:18:03,576 ----------------------------------------------------------------------------------------------------