2023-10-13 09:19:43,669 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,670 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 09:19:43,670 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,671 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:19:43,671 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,671 Train: 1214 sentences
2023-10-13 09:19:43,671 (train_with_dev=False, train_with_test=False)
2023-10-13 09:19:43,671 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,671 Training Params:
2023-10-13 09:19:43,671 - learning_rate: "5e-05"
2023-10-13 09:19:43,671 - mini_batch_size: "8"
2023-10-13 09:19:43,671 - max_epochs: "10"
2023-10-13 09:19:43,671 - shuffle: "True"
2023-10-13 09:19:43,671 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,671 Plugins:
2023-10-13 09:19:43,671 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:19:43,671 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,671 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:19:43,671 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:19:43,671 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,671 Computation:
2023-10-13 09:19:43,671 - compute on device: cuda:0
2023-10-13 09:19:43,671 - embedding storage: none
2023-10-13 09:19:43,671 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,671 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 09:19:43,671 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:43,671 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:44,571 epoch 1 - iter 15/152 - loss 3.38346249 - time (sec): 0.90 - samples/sec: 3419.08 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:19:45,445 epoch 1 - iter 30/152 - loss 2.97532840 - time (sec): 1.77 - samples/sec: 3566.23 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:19:46,325 epoch 1 - iter 45/152 - loss 2.30066272 - time (sec): 2.65 - samples/sec: 3550.93 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:19:47,196 epoch 1 - iter 60/152 - loss 1.93308055 - time (sec): 3.52 - samples/sec: 3483.74 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:19:48,072 epoch 1 - iter 75/152 - loss 1.68388845 - time (sec): 4.40 - samples/sec: 3525.22 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:19:48,878 epoch 1 - iter 90/152 - loss 1.51097794 - time (sec): 5.21 - samples/sec: 3539.24 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:19:49,683 epoch 1 - iter 105/152 - loss 1.35572155 - time (sec): 6.01 - samples/sec: 3554.54 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:19:50,520 epoch 1 - iter 120/152 - loss 1.23843244 - time (sec): 6.85 - samples/sec: 3563.92 - lr: 0.000039 - momentum: 0.000000
2023-10-13 09:19:51,333 epoch 1 - iter 135/152 - loss 1.13922293 - time (sec): 7.66 - samples/sec: 3570.55 - lr: 0.000044 - momentum: 0.000000
2023-10-13 09:19:52,184 epoch 1 - iter 150/152 - loss 1.05281381 - time (sec): 8.51 - samples/sec: 3593.60 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:19:52,289 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:52,289 EPOCH 1 done: loss 1.0435 - lr: 0.000049
2023-10-13 09:19:53,011 DEV : loss 0.2644125819206238 - f1-score (micro avg) 0.4632
2023-10-13 09:19:53,017 saving best model
2023-10-13 09:19:53,379 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:54,217 epoch 2 - iter 15/152 - loss 0.31275015 - time (sec): 0.84 - samples/sec: 3557.92 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:19:55,067 epoch 2 - iter 30/152 - loss 0.26855105 - time (sec): 1.69 - samples/sec: 3571.51 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:19:55,949 epoch 2 - iter 45/152 - loss 0.22728465 - time (sec): 2.57 - samples/sec: 3533.92 - lr: 0.000048 - momentum: 0.000000
2023-10-13 09:19:56,823 epoch 2 - iter 60/152 - loss 0.22107953 - time (sec): 3.44 - samples/sec: 3526.53 - lr: 0.000048 - momentum: 0.000000
2023-10-13 09:19:57,706 epoch 2 - iter 75/152 - loss 0.20765106 - time (sec): 4.32 - samples/sec: 3548.73 - lr: 0.000047 - momentum: 0.000000
2023-10-13 09:19:58,576 epoch 2 - iter 90/152 - loss 0.19761933 - time (sec): 5.19 - samples/sec: 3514.58 - lr: 0.000047 - momentum: 0.000000
2023-10-13 09:19:59,451 epoch 2 - iter 105/152 - loss 0.19040845 - time (sec): 6.07 - samples/sec: 3549.42 - lr: 0.000046 - momentum: 0.000000
2023-10-13 09:20:00,277 epoch 2 - iter 120/152 - loss 0.18038944 - time (sec): 6.90 - samples/sec: 3584.85 - lr: 0.000046 - momentum: 0.000000
2023-10-13 09:20:01,179 epoch 2 - iter 135/152 - loss 0.17918419 - time (sec): 7.80 - samples/sec: 3565.51 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:20:02,016 epoch 2 - iter 150/152 - loss 0.17422562 - time (sec): 8.63 - samples/sec: 3565.44 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:20:02,114 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:02,114 EPOCH 2 done: loss 0.1741 - lr: 0.000045
2023-10-13 09:20:03,069 DEV : loss 0.1539154350757599 - f1-score (micro avg) 0.7722
2023-10-13 09:20:03,076 saving best model
2023-10-13 09:20:03,569 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:04,426 epoch 3 - iter 15/152 - loss 0.06608612 - time (sec): 0.85 - samples/sec: 3559.99 - lr: 0.000044 - momentum: 0.000000
2023-10-13 09:20:05,293 epoch 3 - iter 30/152 - loss 0.07392310 - time (sec): 1.72 - samples/sec: 3537.93 - lr: 0.000043 - momentum: 0.000000
2023-10-13 09:20:06,137 epoch 3 - iter 45/152 - loss 0.08018555 - time (sec): 2.57 - samples/sec: 3659.77 - lr: 0.000043 - momentum: 0.000000
2023-10-13 09:20:06,958 epoch 3 - iter 60/152 - loss 0.07734541 - time (sec): 3.39 - samples/sec: 3636.50 - lr: 0.000042 - momentum: 0.000000
2023-10-13 09:20:07,830 epoch 3 - iter 75/152 - loss 0.07945579 - time (sec): 4.26 - samples/sec: 3596.63 - lr: 0.000042 - momentum: 0.000000
2023-10-13 09:20:08,808 epoch 3 - iter 90/152 - loss 0.08341963 - time (sec): 5.24 - samples/sec: 3473.18 - lr: 0.000041 - momentum: 0.000000
2023-10-13 09:20:09,650 epoch 3 - iter 105/152 - loss 0.08781815 - time (sec): 6.08 - samples/sec: 3512.66 - lr: 0.000041 - momentum: 0.000000
2023-10-13 09:20:10,500 epoch 3 - iter 120/152 - loss 0.09061035 - time (sec): 6.93 - samples/sec: 3514.87 - lr: 0.000040 - momentum: 0.000000
2023-10-13 09:20:11,382 epoch 3 - iter 135/152 - loss 0.08938512 - time (sec): 7.81 - samples/sec: 3554.72 - lr: 0.000040 - momentum: 0.000000
2023-10-13 09:20:12,194 epoch 3 - iter 150/152 - loss 0.08918940 - time (sec): 8.62 - samples/sec: 3543.13 - lr: 0.000039 - momentum: 0.000000
2023-10-13 09:20:12,304 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:12,304 EPOCH 3 done: loss 0.0887 - lr: 0.000039
2023-10-13 09:20:13,250 DEV : loss 0.14931613206863403 - f1-score (micro avg) 0.7976
2023-10-13 09:20:13,256 saving best model
2023-10-13 09:20:13,766 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:14,611 epoch 4 - iter 15/152 - loss 0.05315868 - time (sec): 0.84 - samples/sec: 3801.87 - lr: 0.000038 - momentum: 0.000000
2023-10-13 09:20:15,460 epoch 4 - iter 30/152 - loss 0.04090943 - time (sec): 1.69 - samples/sec: 3711.76 - lr: 0.000038 - momentum: 0.000000
2023-10-13 09:20:16,299 epoch 4 - iter 45/152 - loss 0.05188231 - time (sec): 2.53 - samples/sec: 3657.67 - lr: 0.000037 - momentum: 0.000000
2023-10-13 09:20:17,132 epoch 4 - iter 60/152 - loss 0.04889130 - time (sec): 3.36 - samples/sec: 3617.32 - lr: 0.000037 - momentum: 0.000000
2023-10-13 09:20:17,984 epoch 4 - iter 75/152 - loss 0.04414436 - time (sec): 4.22 - samples/sec: 3641.00 - lr: 0.000036 - momentum: 0.000000
2023-10-13 09:20:18,819 epoch 4 - iter 90/152 - loss 0.05113613 - time (sec): 5.05 - samples/sec: 3632.71 - lr: 0.000036 - momentum: 0.000000
2023-10-13 09:20:19,649 epoch 4 - iter 105/152 - loss 0.05408777 - time (sec): 5.88 - samples/sec: 3641.72 - lr: 0.000035 - momentum: 0.000000
2023-10-13 09:20:20,528 epoch 4 - iter 120/152 - loss 0.05730074 - time (sec): 6.76 - samples/sec: 3637.28 - lr: 0.000035 - momentum: 0.000000
2023-10-13 09:20:21,404 epoch 4 - iter 135/152 - loss 0.05705748 - time (sec): 7.64 - samples/sec: 3626.22 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:20:22,243 epoch 4 - iter 150/152 - loss 0.06222920 - time (sec): 8.48 - samples/sec: 3609.78 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:20:22,350 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:22,351 EPOCH 4 done: loss 0.0616 - lr: 0.000034
2023-10-13 09:20:23,305 DEV : loss 0.16867592930793762 - f1-score (micro avg) 0.8
2023-10-13 09:20:23,311 saving best model
2023-10-13 09:20:23,760 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:24,630 epoch 5 - iter 15/152 - loss 0.04383519 - time (sec): 0.87 - samples/sec: 3661.10 - lr: 0.000033 - momentum: 0.000000
2023-10-13 09:20:25,465 epoch 5 - iter 30/152 - loss 0.03552252 - time (sec): 1.70 - samples/sec: 3756.11 - lr: 0.000032 - momentum: 0.000000
2023-10-13 09:20:26,331 epoch 5 - iter 45/152 - loss 0.03538290 - time (sec): 2.57 - samples/sec: 3672.93 - lr: 0.000032 - momentum: 0.000000
2023-10-13 09:20:27,159 epoch 5 - iter 60/152 - loss 0.03921473 - time (sec): 3.40 - samples/sec: 3671.03 - lr: 0.000031 - momentum: 0.000000
2023-10-13 09:20:27,996 epoch 5 - iter 75/152 - loss 0.04800286 - time (sec): 4.23 - samples/sec: 3665.66 - lr: 0.000031 - momentum: 0.000000
2023-10-13 09:20:28,819 epoch 5 - iter 90/152 - loss 0.04586651 - time (sec): 5.06 - samples/sec: 3674.82 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:20:29,665 epoch 5 - iter 105/152 - loss 0.04281224 - time (sec): 5.90 - samples/sec: 3647.66 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:20:30,524 epoch 5 - iter 120/152 - loss 0.04804606 - time (sec): 6.76 - samples/sec: 3642.15 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:20:31,360 epoch 5 - iter 135/152 - loss 0.04923516 - time (sec): 7.60 - samples/sec: 3633.97 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:20:32,223 epoch 5 - iter 150/152 - loss 0.04699684 - time (sec): 8.46 - samples/sec: 3624.02 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:20:32,321 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:32,321 EPOCH 5 done: loss 0.0469 - lr: 0.000028
2023-10-13 09:20:33,288 DEV : loss 0.18302051723003387 - f1-score (micro avg) 0.8273
2023-10-13 09:20:33,295 saving best model
2023-10-13 09:20:33,776 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:34,584 epoch 6 - iter 15/152 - loss 0.03845158 - time (sec): 0.80 - samples/sec: 3486.72 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:20:35,479 epoch 6 - iter 30/152 - loss 0.03533511 - time (sec): 1.70 - samples/sec: 3567.43 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:20:36,340 epoch 6 - iter 45/152 - loss 0.02951617 - time (sec): 2.56 - samples/sec: 3560.03 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:20:37,227 epoch 6 - iter 60/152 - loss 0.03082758 - time (sec): 3.45 - samples/sec: 3538.12 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:20:38,020 epoch 6 - iter 75/152 - loss 0.02910731 - time (sec): 4.24 - samples/sec: 3532.66 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:20:38,862 epoch 6 - iter 90/152 - loss 0.02763532 - time (sec): 5.08 - samples/sec: 3558.85 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:20:39,703 epoch 6 - iter 105/152 - loss 0.03064225 - time (sec): 5.92 - samples/sec: 3592.20 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:20:40,560 epoch 6 - iter 120/152 - loss 0.03447101 - time (sec): 6.78 - samples/sec: 3588.74 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:20:41,371 epoch 6 - iter 135/152 - loss 0.03335287 - time (sec): 7.59 - samples/sec: 3601.34 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:20:42,205 epoch 6 - iter 150/152 - loss 0.03378038 - time (sec): 8.42 - samples/sec: 3635.78 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:20:42,306 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:42,306 EPOCH 6 done: loss 0.0337 - lr: 0.000022
2023-10-13 09:20:43,306 DEV : loss 0.19683806598186493 - f1-score (micro avg) 0.8225
2023-10-13 09:20:43,312 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:44,141 epoch 7 - iter 15/152 - loss 0.02522725 - time (sec): 0.83 - samples/sec: 3723.79 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:20:45,015 epoch 7 - iter 30/152 - loss 0.03921157 - time (sec): 1.70 - samples/sec: 3597.43 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:20:45,845 epoch 7 - iter 45/152 - loss 0.03619964 - time (sec): 2.53 - samples/sec: 3587.92 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:20:46,774 epoch 7 - iter 60/152 - loss 0.03239290 - time (sec): 3.46 - samples/sec: 3618.93 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:20:47,563 epoch 7 - iter 75/152 - loss 0.02696597 - time (sec): 4.25 - samples/sec: 3642.10 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:20:48,386 epoch 7 - iter 90/152 - loss 0.02334264 - time (sec): 5.07 - samples/sec: 3672.24 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:20:49,217 epoch 7 - iter 105/152 - loss 0.02121594 - time (sec): 5.90 - samples/sec: 3682.08 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:20:50,087 epoch 7 - iter 120/152 - loss 0.02527719 - time (sec): 6.77 - samples/sec: 3704.72 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:20:50,909 epoch 7 - iter 135/152 - loss 0.02408376 - time (sec): 7.60 - samples/sec: 3662.26 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:20:51,714 epoch 7 - iter 150/152 - loss 0.02501798 - time (sec): 8.40 - samples/sec: 3656.11 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:20:51,813 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:51,813 EPOCH 7 done: loss 0.0248 - lr: 0.000017
2023-10-13 09:20:52,754 DEV : loss 0.2068709135055542 - f1-score (micro avg) 0.832
2023-10-13 09:20:52,760 saving best model
2023-10-13 09:20:53,216 ----------------------------------------------------------------------------------------------------
2023-10-13 09:20:54,166 epoch 8 - iter 15/152 - loss 0.01936172 - time (sec): 0.94 - samples/sec: 3574.90 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:20:55,014 epoch 8 - iter 30/152 - loss 0.01290260 - time (sec): 1.79 - samples/sec: 3524.69 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:20:55,863 epoch 8 - iter 45/152 - loss 0.01342831 - time (sec): 2.64 - samples/sec: 3560.86 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:20:56,679 epoch 8 - iter 60/152 - loss 0.01444531 - time (sec): 3.46 - samples/sec: 3594.77 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:20:57,551 epoch 8 - iter 75/152 - loss 0.01668268 - time (sec): 4.33 - samples/sec: 3606.62 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:20:58,398 epoch 8 - iter 90/152 - loss 0.01422119 - time (sec): 5.17 - samples/sec: 3617.50 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:20:59,208 epoch 8 - iter 105/152 - loss 0.01675058 - time (sec): 5.98 - samples/sec: 3621.42 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:21:00,059 epoch 8 - iter 120/152 - loss 0.01847952 - time (sec): 6.84 - samples/sec: 3620.37 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:21:00,883 epoch 8 - iter 135/152 - loss 0.01717958 - time (sec): 7.66 - samples/sec: 3612.05 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:21:01,703 epoch 8 - iter 150/152 - loss 0.01791064 - time (sec): 8.48 - samples/sec: 3608.11 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:21:01,817 ----------------------------------------------------------------------------------------------------
2023-10-13 09:21:01,817 EPOCH 8 done: loss 0.0177 - lr: 0.000011
2023-10-13 09:21:02,745 DEV : loss 0.2109656035900116 - f1-score (micro avg) 0.8306
2023-10-13 09:21:02,751 ----------------------------------------------------------------------------------------------------
2023-10-13 09:21:03,538 epoch 9 - iter 15/152 - loss 0.01841773 - time (sec): 0.79 - samples/sec: 3509.23 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:21:04,394 epoch 9 - iter 30/152 - loss 0.01084808 - time (sec): 1.64 - samples/sec: 3596.70 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:21:05,255 epoch 9 - iter 45/152 - loss 0.02176395 - time (sec): 2.50 - samples/sec: 3677.56 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:21:06,129 epoch 9 - iter 60/152 - loss 0.02005315 - time (sec): 3.38 - samples/sec: 3666.53 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:21:06,975 epoch 9 - iter 75/152 - loss 0.01624306 - time (sec): 4.22 - samples/sec: 3638.12 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:21:07,784 epoch 9 - iter 90/152 - loss 0.01758859 - time (sec): 5.03 - samples/sec: 3651.28 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:21:08,603 epoch 9 - iter 105/152 - loss 0.01621420 - time (sec): 5.85 - samples/sec: 3625.49 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:21:09,476 epoch 9 - iter 120/152 - loss 0.01529610 - time (sec): 6.72 - samples/sec: 3658.47 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:21:10,314 epoch 9 - iter 135/152 - loss 0.01571513 - time (sec): 7.56 - samples/sec: 3650.17 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:21:11,165 epoch 9 - iter 150/152 - loss 0.01446427 - time (sec): 8.41 - samples/sec: 3639.36 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:21:11,268 ----------------------------------------------------------------------------------------------------
2023-10-13 09:21:11,268 EPOCH 9 done: loss 0.0143 - lr: 0.000006
2023-10-13 09:21:12,220 DEV : loss 0.20851895213127136 - f1-score (micro avg) 0.8374
2023-10-13 09:21:12,225 saving best model
2023-10-13 09:21:12,701 ----------------------------------------------------------------------------------------------------
2023-10-13 09:21:13,528 epoch 10 - iter 15/152 - loss 0.00046529 - time (sec): 0.83 - samples/sec: 3534.52 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:21:14,417 epoch 10 - iter 30/152 - loss 0.00803632 - time (sec): 1.71 - samples/sec: 3579.25 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:21:15,272 epoch 10 - iter 45/152 - loss 0.01113557 - time (sec): 2.57 - samples/sec: 3699.47 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:21:16,112 epoch 10 - iter 60/152 - loss 0.01053112 - time (sec): 3.41 - samples/sec: 3671.54 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:21:16,929 epoch 10 - iter 75/152 - loss 0.01110488 - time (sec): 4.23 - samples/sec: 3633.56 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:21:17,845 epoch 10 - iter 90/152 - loss 0.01041796 - time (sec): 5.14 - samples/sec: 3633.82 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:21:18,659 epoch 10 - iter 105/152 - loss 0.01294686 - time (sec): 5.96 - samples/sec: 3658.62 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:21:19,479 epoch 10 - iter 120/152 - loss 0.01274279 - time (sec): 6.78 - samples/sec: 3630.54 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:21:20,361 epoch 10 - iter 135/152 - loss 0.01237044 - time (sec): 7.66 - samples/sec: 3617.20 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:21:21,180 epoch 10 - iter 150/152 - loss 0.01196728 - time (sec): 8.48 - samples/sec: 3631.35 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:21:21,276 ----------------------------------------------------------------------------------------------------
2023-10-13 09:21:21,276 EPOCH 10 done: loss 0.0119 - lr: 0.000000
2023-10-13 09:21:22,220 DEV : loss 0.20164324343204498 - f1-score (micro avg) 0.8474
2023-10-13 09:21:22,226 saving best model
2023-10-13 09:21:23,038 ----------------------------------------------------------------------------------------------------
2023-10-13 09:21:23,039 Loading model from best epoch ...
2023-10-13 09:21:24,377 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-13 09:21:25,207
Results:
- F-score (micro) 0.8076
- F-score (macro) 0.628
- Accuracy 0.6819
By class:
              precision    recall  f1-score   support
       scope     0.7333    0.8013    0.7658       151
        pers     0.8165    0.9271    0.8683        96
        work     0.7818    0.9053    0.8390        95
         loc     0.6667    0.6667    0.6667         3
        date     0.0000    0.0000    0.0000         3
   micro avg     0.7641    0.8563    0.8076       348
   macro avg     0.5997    0.6601    0.6280       348
weighted avg     0.7626    0.8563    0.8066       348
2023-10-13 09:21:25,207 ----------------------------------------------------------------------------------------------------
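
The aggregate rows of the final report can be cross-checked against the per-class rows. The following is a minimal sketch, assuming the usual definitions: F1 is the harmonic mean of precision and recall, the macro average is the unweighted mean over classes, and the weighted average is weighted by support. The names `by_class` and `f1` are illustrative and not part of Flair. Because the inputs are the four-decimal values printed in the log, results agree with the report only up to rounding.

```python
# Per-class rows copied from the "By class" table above:
# (precision, recall, f1-score, support)
by_class = {
    "scope": (0.7333, 0.8013, 0.7658, 151),
    "pers": (0.8165, 0.9271, 0.8683, 96),
    "work": (0.7818, 0.9053, 0.8390, 95),
    "loc": (0.6667, 0.6667, 0.6667, 3),
    "date": (0.0000, 0.0000, 0.0000, 3),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1 from the micro-averaged precision and recall in the report.
micro_f1 = f1(0.7641, 0.8563)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 scores weighted by support.
total_support = sum(row[3] for row in by_class.values())
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

print(f"micro F1:    {micro_f1:.4f}")
print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"support:     {total_support}")
```

The gap between the micro and macro scores comes mostly from the `date` class, which has only 3 test entities and an F1 of 0.0000, so it pulls the unweighted mean down far more than the support-weighted figures.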