2023-10-08 19:09:42,730 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:42,731 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
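The module dump above fully determines the encoder's parameter count, so the architecture can be sanity-checked with plain arithmetic. A minimal sketch using only the shapes printed in the log (assuming, as in T5, that `T5LayerNorm` is an RMSNorm with a single weight vector and that the relative attention bias lives only in block 0):

```python
# Shapes read off the module dump: d_model=1472, attention inner dim 384,
# feed-forward dim 3584, byte vocab 384, 12 encoder blocks.
d_model, d_attn, d_ff, vocab, n_blocks = 1472, 384, 3584, 384, 12

# Self-attention: q/k/v project d_model -> d_attn, o projects back; no biases.
attn = 3 * d_model * d_attn + d_attn * d_model

# Gated feed-forward: wi_0 and wi_1 (d_model -> d_ff) plus wo (d_ff -> d_model).
ff = 2 * d_model * d_ff + d_ff * d_model

# Two RMSNorm-style layer norms per block, one weight vector each.
norms = 2 * d_model

per_block = attn + ff + norms
encoder = (n_blocks * per_block   # 12 transformer blocks
           + 32 * 6               # relative_attention_bias (block 0 only)
           + d_model              # final_layer_norm
           + vocab * d_model)     # shared byte embedding

print(f"{per_block:,} params/block, {encoder:,} encoder params")
```

About 218M encoder parameters, which is consistent with ByT5-small being a heavily encoder-weighted model.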
2023-10-08 19:09:42,731 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:42,731 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 19:09:42,731 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:42,731 Train: 966 sentences
2023-10-08 19:09:42,732 (train_with_dev=False, train_with_test=False)
2023-10-08 19:09:42,732 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:42,732 Training Params:
2023-10-08 19:09:42,732 - learning_rate: "0.00016"
2023-10-08 19:09:42,732 - mini_batch_size: "8"
2023-10-08 19:09:42,732 - max_epochs: "10"
2023-10-08 19:09:42,732 - shuffle: "True"
2023-10-08 19:09:42,732 ----------------------------------------------------------------------------------------------------
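The 121 iterations per epoch reported in the batch lines below follow directly from the corpus size and batch size above:

```python
import math

train_sentences = 966  # from the MultiCorpus summary
mini_batch_size = 8    # from the training params
max_epochs = 10

iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
total_steps = iters_per_epoch * max_epochs
print(iters_per_epoch, total_steps)  # 121 iterations/epoch, 1210 steps total
```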
2023-10-08 19:09:42,732 Plugins:
2023-10-08 19:09:42,732 - TensorboardLogger
2023-10-08 19:09:42,732 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 19:09:42,732 ----------------------------------------------------------------------------------------------------
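The per-iteration lr values below are consistent with a linear warmup over the first 10% of the 1210 total steps, followed by linear decay to zero. A minimal reconstruction of such a schedule (my sketch of the behaviour implied by the logged values, not Flair's actual `LinearScheduler` implementation):

```python
def linear_lr(step, peak=0.00016, total_steps=1210, warmup_fraction=0.1):
    """Linear warmup to `peak` over warmup_fraction of training,
    then linear decay to 0 (reconstruction from the logged lr values)."""
    warmup = int(total_steps * warmup_fraction)  # 121 steps here
    if step < warmup:
        return peak * step / warmup
    return peak * (total_steps - step) / (total_steps - warmup)

# Epoch 2, iter 12 corresponds to global step 121 + 12 = 133,
# which lands in the decay phase.
print(round(linear_lr(133), 6))  # 0.000158, matching the log line
```

This also explains why the lr climbs through epoch 1 (the warmup covers roughly the first epoch) and then falls steadily to ~0 by the end of epoch 10.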
2023-10-08 19:09:42,732 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 19:09:42,732 - metric: "('micro avg', 'f1-score')"
2023-10-08 19:09:42,732 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:42,732 Computation:
2023-10-08 19:09:42,732 - compute on device: cuda:0
2023-10-08 19:09:42,732 - embedding storage: none
2023-10-08 19:09:42,732 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:42,733 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1"
2023-10-08 19:09:42,733 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:42,733 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:42,733 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-08 19:09:51,412 epoch 1 - iter 12/121 - loss 3.23508884 - time (sec): 8.68 - samples/sec: 252.23 - lr: 0.000015 - momentum: 0.000000
2023-10-08 19:10:00,732 epoch 1 - iter 24/121 - loss 3.22835105 - time (sec): 18.00 - samples/sec: 258.08 - lr: 0.000030 - momentum: 0.000000
2023-10-08 19:10:09,664 epoch 1 - iter 36/121 - loss 3.21768152 - time (sec): 26.93 - samples/sec: 257.37 - lr: 0.000046 - momentum: 0.000000
2023-10-08 19:10:18,944 epoch 1 - iter 48/121 - loss 3.19823548 - time (sec): 36.21 - samples/sec: 260.48 - lr: 0.000062 - momentum: 0.000000
2023-10-08 19:10:28,319 epoch 1 - iter 60/121 - loss 3.16273646 - time (sec): 45.59 - samples/sec: 258.31 - lr: 0.000078 - momentum: 0.000000
2023-10-08 19:10:38,333 epoch 1 - iter 72/121 - loss 3.10021571 - time (sec): 55.60 - samples/sec: 259.25 - lr: 0.000094 - momentum: 0.000000
2023-10-08 19:10:47,328 epoch 1 - iter 84/121 - loss 3.03374434 - time (sec): 64.59 - samples/sec: 259.14 - lr: 0.000110 - momentum: 0.000000
2023-10-08 19:10:56,237 epoch 1 - iter 96/121 - loss 2.95823866 - time (sec): 73.50 - samples/sec: 258.95 - lr: 0.000126 - momentum: 0.000000
2023-10-08 19:11:06,174 epoch 1 - iter 108/121 - loss 2.86311152 - time (sec): 83.44 - samples/sec: 260.50 - lr: 0.000141 - momentum: 0.000000
2023-10-08 19:11:16,351 epoch 1 - iter 120/121 - loss 2.76149637 - time (sec): 93.62 - samples/sec: 261.84 - lr: 0.000157 - momentum: 0.000000
2023-10-08 19:11:17,066 ----------------------------------------------------------------------------------------------------
2023-10-08 19:11:17,066 EPOCH 1 done: loss 2.7536 - lr: 0.000157
2023-10-08 19:11:23,428 DEV : loss 1.8026584386825562 - f1-score (micro avg) 0.0
2023-10-08 19:11:23,434 ----------------------------------------------------------------------------------------------------
2023-10-08 19:11:32,523 epoch 2 - iter 12/121 - loss 1.80179870 - time (sec): 9.09 - samples/sec: 249.91 - lr: 0.000158 - momentum: 0.000000
2023-10-08 19:11:41,441 epoch 2 - iter 24/121 - loss 1.68078519 - time (sec): 18.01 - samples/sec: 259.09 - lr: 0.000157 - momentum: 0.000000
2023-10-08 19:11:50,070 epoch 2 - iter 36/121 - loss 1.57421188 - time (sec): 26.63 - samples/sec: 267.21 - lr: 0.000155 - momentum: 0.000000
2023-10-08 19:11:58,910 epoch 2 - iter 48/121 - loss 1.47404593 - time (sec): 35.47 - samples/sec: 271.49 - lr: 0.000153 - momentum: 0.000000
2023-10-08 19:12:07,552 epoch 2 - iter 60/121 - loss 1.38236308 - time (sec): 44.12 - samples/sec: 272.05 - lr: 0.000151 - momentum: 0.000000
2023-10-08 19:12:16,691 epoch 2 - iter 72/121 - loss 1.28818548 - time (sec): 53.26 - samples/sec: 276.22 - lr: 0.000150 - momentum: 0.000000
2023-10-08 19:12:26,207 epoch 2 - iter 84/121 - loss 1.21046557 - time (sec): 62.77 - samples/sec: 276.07 - lr: 0.000148 - momentum: 0.000000
2023-10-08 19:12:34,679 epoch 2 - iter 96/121 - loss 1.14389913 - time (sec): 71.24 - samples/sec: 276.52 - lr: 0.000146 - momentum: 0.000000
2023-10-08 19:12:43,013 epoch 2 - iter 108/121 - loss 1.09048980 - time (sec): 79.58 - samples/sec: 276.35 - lr: 0.000144 - momentum: 0.000000
2023-10-08 19:12:52,054 epoch 2 - iter 120/121 - loss 1.03589235 - time (sec): 88.62 - samples/sec: 277.84 - lr: 0.000143 - momentum: 0.000000
2023-10-08 19:12:52,538 ----------------------------------------------------------------------------------------------------
2023-10-08 19:12:52,539 EPOCH 2 done: loss 1.0336 - lr: 0.000143
2023-10-08 19:12:58,383 DEV : loss 0.6477599143981934 - f1-score (micro avg) 0.0
2023-10-08 19:12:58,389 ----------------------------------------------------------------------------------------------------
2023-10-08 19:13:07,354 epoch 3 - iter 12/121 - loss 0.60814729 - time (sec): 8.96 - samples/sec: 288.62 - lr: 0.000141 - momentum: 0.000000
2023-10-08 19:13:15,874 epoch 3 - iter 24/121 - loss 0.63644310 - time (sec): 17.48 - samples/sec: 288.96 - lr: 0.000139 - momentum: 0.000000
2023-10-08 19:13:25,124 epoch 3 - iter 36/121 - loss 0.60243801 - time (sec): 26.73 - samples/sec: 289.64 - lr: 0.000137 - momentum: 0.000000
2023-10-08 19:13:33,664 epoch 3 - iter 48/121 - loss 0.57783549 - time (sec): 35.27 - samples/sec: 285.60 - lr: 0.000135 - momentum: 0.000000
2023-10-08 19:13:41,996 epoch 3 - iter 60/121 - loss 0.54716035 - time (sec): 43.61 - samples/sec: 283.54 - lr: 0.000134 - momentum: 0.000000
2023-10-08 19:13:50,259 epoch 3 - iter 72/121 - loss 0.52880767 - time (sec): 51.87 - samples/sec: 282.54 - lr: 0.000132 - momentum: 0.000000
2023-10-08 19:13:59,460 epoch 3 - iter 84/121 - loss 0.51426265 - time (sec): 61.07 - samples/sec: 283.27 - lr: 0.000130 - momentum: 0.000000
2023-10-08 19:14:08,520 epoch 3 - iter 96/121 - loss 0.49778989 - time (sec): 70.13 - samples/sec: 284.18 - lr: 0.000128 - momentum: 0.000000
2023-10-08 19:14:16,806 epoch 3 - iter 108/121 - loss 0.48087024 - time (sec): 78.42 - samples/sec: 282.94 - lr: 0.000127 - momentum: 0.000000
2023-10-08 19:14:25,476 epoch 3 - iter 120/121 - loss 0.46915039 - time (sec): 87.09 - samples/sec: 282.40 - lr: 0.000125 - momentum: 0.000000
2023-10-08 19:14:26,025 ----------------------------------------------------------------------------------------------------
2023-10-08 19:14:26,026 EPOCH 3 done: loss 0.4677 - lr: 0.000125
2023-10-08 19:14:31,906 DEV : loss 0.36093592643737793 - f1-score (micro avg) 0.1892
2023-10-08 19:14:31,912 saving best model
2023-10-08 19:14:32,761 ----------------------------------------------------------------------------------------------------
2023-10-08 19:14:40,685 epoch 4 - iter 12/121 - loss 0.40042720 - time (sec): 7.92 - samples/sec: 277.17 - lr: 0.000123 - momentum: 0.000000
2023-10-08 19:14:49,980 epoch 4 - iter 24/121 - loss 0.37221502 - time (sec): 17.22 - samples/sec: 281.15 - lr: 0.000121 - momentum: 0.000000
2023-10-08 19:14:58,837 epoch 4 - iter 36/121 - loss 0.34677446 - time (sec): 26.07 - samples/sec: 278.85 - lr: 0.000120 - momentum: 0.000000
2023-10-08 19:15:07,396 epoch 4 - iter 48/121 - loss 0.33349337 - time (sec): 34.63 - samples/sec: 279.55 - lr: 0.000118 - momentum: 0.000000
2023-10-08 19:15:15,778 epoch 4 - iter 60/121 - loss 0.32999512 - time (sec): 43.02 - samples/sec: 279.64 - lr: 0.000116 - momentum: 0.000000
2023-10-08 19:15:25,418 epoch 4 - iter 72/121 - loss 0.31739942 - time (sec): 52.66 - samples/sec: 281.34 - lr: 0.000114 - momentum: 0.000000
2023-10-08 19:15:34,551 epoch 4 - iter 84/121 - loss 0.30532809 - time (sec): 61.79 - samples/sec: 281.49 - lr: 0.000113 - momentum: 0.000000
2023-10-08 19:15:43,239 epoch 4 - iter 96/121 - loss 0.30401618 - time (sec): 70.48 - samples/sec: 279.54 - lr: 0.000111 - momentum: 0.000000
2023-10-08 19:15:52,225 epoch 4 - iter 108/121 - loss 0.30419680 - time (sec): 79.46 - samples/sec: 278.79 - lr: 0.000109 - momentum: 0.000000
2023-10-08 19:16:01,079 epoch 4 - iter 120/121 - loss 0.29604969 - time (sec): 88.32 - samples/sec: 277.86 - lr: 0.000107 - momentum: 0.000000
2023-10-08 19:16:01,730 ----------------------------------------------------------------------------------------------------
2023-10-08 19:16:01,730 EPOCH 4 done: loss 0.2947 - lr: 0.000107
2023-10-08 19:16:07,857 DEV : loss 0.2683485746383667 - f1-score (micro avg) 0.5012
2023-10-08 19:16:07,863 saving best model
2023-10-08 19:16:12,260 ----------------------------------------------------------------------------------------------------
2023-10-08 19:16:21,272 epoch 5 - iter 12/121 - loss 0.29776193 - time (sec): 9.01 - samples/sec: 281.02 - lr: 0.000105 - momentum: 0.000000
2023-10-08 19:16:30,455 epoch 5 - iter 24/121 - loss 0.25433897 - time (sec): 18.19 - samples/sec: 270.81 - lr: 0.000104 - momentum: 0.000000
2023-10-08 19:16:40,176 epoch 5 - iter 36/121 - loss 0.24387713 - time (sec): 27.91 - samples/sec: 276.13 - lr: 0.000102 - momentum: 0.000000
2023-10-08 19:16:49,852 epoch 5 - iter 48/121 - loss 0.23814017 - time (sec): 37.59 - samples/sec: 276.64 - lr: 0.000100 - momentum: 0.000000
2023-10-08 19:16:59,242 epoch 5 - iter 60/121 - loss 0.23460200 - time (sec): 46.98 - samples/sec: 273.58 - lr: 0.000098 - momentum: 0.000000
2023-10-08 19:17:08,114 epoch 5 - iter 72/121 - loss 0.23228791 - time (sec): 55.85 - samples/sec: 270.64 - lr: 0.000097 - momentum: 0.000000
2023-10-08 19:17:17,452 epoch 5 - iter 84/121 - loss 0.22230739 - time (sec): 65.19 - samples/sec: 270.87 - lr: 0.000095 - momentum: 0.000000
2023-10-08 19:17:25,640 epoch 5 - iter 96/121 - loss 0.22054803 - time (sec): 73.38 - samples/sec: 270.20 - lr: 0.000093 - momentum: 0.000000
2023-10-08 19:17:34,248 epoch 5 - iter 108/121 - loss 0.22183902 - time (sec): 81.99 - samples/sec: 270.44 - lr: 0.000091 - momentum: 0.000000
2023-10-08 19:17:42,810 epoch 5 - iter 120/121 - loss 0.21744106 - time (sec): 90.55 - samples/sec: 270.98 - lr: 0.000090 - momentum: 0.000000
2023-10-08 19:17:43,499 ----------------------------------------------------------------------------------------------------
2023-10-08 19:17:43,499 EPOCH 5 done: loss 0.2170 - lr: 0.000090
2023-10-08 19:17:49,444 DEV : loss 0.21106016635894775 - f1-score (micro avg) 0.553
2023-10-08 19:17:49,450 saving best model
2023-10-08 19:17:53,836 ----------------------------------------------------------------------------------------------------
2023-10-08 19:18:02,159 epoch 6 - iter 12/121 - loss 0.18386118 - time (sec): 8.32 - samples/sec: 270.14 - lr: 0.000088 - momentum: 0.000000
2023-10-08 19:18:11,032 epoch 6 - iter 24/121 - loss 0.18168602 - time (sec): 17.19 - samples/sec: 278.81 - lr: 0.000086 - momentum: 0.000000
2023-10-08 19:18:19,555 epoch 6 - iter 36/121 - loss 0.17230306 - time (sec): 25.72 - samples/sec: 275.96 - lr: 0.000084 - momentum: 0.000000
2023-10-08 19:18:28,399 epoch 6 - iter 48/121 - loss 0.17026892 - time (sec): 34.56 - samples/sec: 278.72 - lr: 0.000082 - momentum: 0.000000
2023-10-08 19:18:37,174 epoch 6 - iter 60/121 - loss 0.16788127 - time (sec): 43.34 - samples/sec: 280.92 - lr: 0.000081 - momentum: 0.000000
2023-10-08 19:18:46,230 epoch 6 - iter 72/121 - loss 0.16544046 - time (sec): 52.39 - samples/sec: 284.06 - lr: 0.000079 - momentum: 0.000000
2023-10-08 19:18:54,833 epoch 6 - iter 84/121 - loss 0.17081982 - time (sec): 61.00 - samples/sec: 286.40 - lr: 0.000077 - momentum: 0.000000
2023-10-08 19:19:03,180 epoch 6 - iter 96/121 - loss 0.16853557 - time (sec): 69.34 - samples/sec: 284.31 - lr: 0.000075 - momentum: 0.000000
2023-10-08 19:19:11,336 epoch 6 - iter 108/121 - loss 0.16523181 - time (sec): 77.50 - samples/sec: 283.35 - lr: 0.000074 - momentum: 0.000000
2023-10-08 19:19:20,440 epoch 6 - iter 120/121 - loss 0.16613938 - time (sec): 86.60 - samples/sec: 282.97 - lr: 0.000072 - momentum: 0.000000
2023-10-08 19:19:21,215 ----------------------------------------------------------------------------------------------------
2023-10-08 19:19:21,215 EPOCH 6 done: loss 0.1661 - lr: 0.000072
2023-10-08 19:19:27,054 DEV : loss 0.1760144680738449 - f1-score (micro avg) 0.7806
2023-10-08 19:19:27,060 saving best model
2023-10-08 19:19:31,418 ----------------------------------------------------------------------------------------------------
2023-10-08 19:19:38,871 epoch 7 - iter 12/121 - loss 0.10501066 - time (sec): 7.45 - samples/sec: 262.75 - lr: 0.000070 - momentum: 0.000000
2023-10-08 19:19:47,687 epoch 7 - iter 24/121 - loss 0.12549822 - time (sec): 16.27 - samples/sec: 280.06 - lr: 0.000068 - momentum: 0.000000
2023-10-08 19:19:56,335 epoch 7 - iter 36/121 - loss 0.13201670 - time (sec): 24.92 - samples/sec: 280.19 - lr: 0.000066 - momentum: 0.000000
2023-10-08 19:20:05,342 epoch 7 - iter 48/121 - loss 0.13321851 - time (sec): 33.92 - samples/sec: 282.79 - lr: 0.000065 - momentum: 0.000000
2023-10-08 19:20:14,166 epoch 7 - iter 60/121 - loss 0.13144974 - time (sec): 42.75 - samples/sec: 282.32 - lr: 0.000063 - momentum: 0.000000
2023-10-08 19:20:22,125 epoch 7 - iter 72/121 - loss 0.12821297 - time (sec): 50.71 - samples/sec: 279.32 - lr: 0.000061 - momentum: 0.000000
2023-10-08 19:20:30,926 epoch 7 - iter 84/121 - loss 0.12881751 - time (sec): 59.51 - samples/sec: 279.48 - lr: 0.000059 - momentum: 0.000000
2023-10-08 19:20:40,120 epoch 7 - iter 96/121 - loss 0.13271001 - time (sec): 68.70 - samples/sec: 279.24 - lr: 0.000058 - momentum: 0.000000
2023-10-08 19:20:49,707 epoch 7 - iter 108/121 - loss 0.13549371 - time (sec): 78.29 - samples/sec: 279.88 - lr: 0.000056 - momentum: 0.000000
2023-10-08 19:20:59,082 epoch 7 - iter 120/121 - loss 0.13227314 - time (sec): 87.66 - samples/sec: 280.45 - lr: 0.000054 - momentum: 0.000000
2023-10-08 19:20:59,619 ----------------------------------------------------------------------------------------------------
2023-10-08 19:20:59,619 EPOCH 7 done: loss 0.1320 - lr: 0.000054
2023-10-08 19:21:05,606 DEV : loss 0.15729516744613647 - f1-score (micro avg) 0.798
2023-10-08 19:21:05,612 saving best model
2023-10-08 19:21:10,027 ----------------------------------------------------------------------------------------------------
2023-10-08 19:21:18,202 epoch 8 - iter 12/121 - loss 0.08777179 - time (sec): 8.17 - samples/sec: 257.14 - lr: 0.000052 - momentum: 0.000000
2023-10-08 19:21:27,132 epoch 8 - iter 24/121 - loss 0.11799951 - time (sec): 17.10 - samples/sec: 271.82 - lr: 0.000051 - momentum: 0.000000
2023-10-08 19:21:36,364 epoch 8 - iter 36/121 - loss 0.11559384 - time (sec): 26.34 - samples/sec: 278.02 - lr: 0.000049 - momentum: 0.000000
2023-10-08 19:21:45,372 epoch 8 - iter 48/121 - loss 0.10958313 - time (sec): 35.34 - samples/sec: 276.68 - lr: 0.000047 - momentum: 0.000000
2023-10-08 19:21:54,675 epoch 8 - iter 60/121 - loss 0.11471928 - time (sec): 44.65 - samples/sec: 275.99 - lr: 0.000045 - momentum: 0.000000
2023-10-08 19:22:03,859 epoch 8 - iter 72/121 - loss 0.11562885 - time (sec): 53.83 - samples/sec: 276.16 - lr: 0.000044 - momentum: 0.000000
2023-10-08 19:22:13,302 epoch 8 - iter 84/121 - loss 0.12004405 - time (sec): 63.27 - samples/sec: 276.54 - lr: 0.000042 - momentum: 0.000000
2023-10-08 19:22:22,457 epoch 8 - iter 96/121 - loss 0.11599093 - time (sec): 72.43 - samples/sec: 275.44 - lr: 0.000040 - momentum: 0.000000
2023-10-08 19:22:31,515 epoch 8 - iter 108/121 - loss 0.11215329 - time (sec): 81.49 - samples/sec: 273.32 - lr: 0.000038 - momentum: 0.000000
2023-10-08 19:22:40,474 epoch 8 - iter 120/121 - loss 0.11354143 - time (sec): 90.45 - samples/sec: 272.27 - lr: 0.000037 - momentum: 0.000000
2023-10-08 19:22:40,942 ----------------------------------------------------------------------------------------------------
2023-10-08 19:22:40,943 EPOCH 8 done: loss 0.1134 - lr: 0.000037
2023-10-08 19:22:47,364 DEV : loss 0.1546175330877304 - f1-score (micro avg) 0.8005
2023-10-08 19:22:47,370 saving best model
2023-10-08 19:22:48,281 ----------------------------------------------------------------------------------------------------
2023-10-08 19:22:56,895 epoch 9 - iter 12/121 - loss 0.13056422 - time (sec): 8.61 - samples/sec: 259.15 - lr: 0.000035 - momentum: 0.000000
2023-10-08 19:23:05,611 epoch 9 - iter 24/121 - loss 0.11272636 - time (sec): 17.33 - samples/sec: 253.79 - lr: 0.000033 - momentum: 0.000000
2023-10-08 19:23:15,150 epoch 9 - iter 36/121 - loss 0.10974938 - time (sec): 26.87 - samples/sec: 257.03 - lr: 0.000031 - momentum: 0.000000
2023-10-08 19:23:24,506 epoch 9 - iter 48/121 - loss 0.11417265 - time (sec): 36.22 - samples/sec: 260.91 - lr: 0.000029 - momentum: 0.000000
2023-10-08 19:23:34,345 epoch 9 - iter 60/121 - loss 0.11168973 - time (sec): 46.06 - samples/sec: 263.12 - lr: 0.000028 - momentum: 0.000000
2023-10-08 19:23:44,305 epoch 9 - iter 72/121 - loss 0.11000208 - time (sec): 56.02 - samples/sec: 262.20 - lr: 0.000026 - momentum: 0.000000
2023-10-08 19:23:53,834 epoch 9 - iter 84/121 - loss 0.10576423 - time (sec): 65.55 - samples/sec: 262.24 - lr: 0.000024 - momentum: 0.000000
2023-10-08 19:24:03,119 epoch 9 - iter 96/121 - loss 0.10372434 - time (sec): 74.84 - samples/sec: 261.61 - lr: 0.000022 - momentum: 0.000000
2023-10-08 19:24:12,544 epoch 9 - iter 108/121 - loss 0.10306218 - time (sec): 84.26 - samples/sec: 261.39 - lr: 0.000021 - momentum: 0.000000
2023-10-08 19:24:22,164 epoch 9 - iter 120/121 - loss 0.09946264 - time (sec): 93.88 - samples/sec: 261.55 - lr: 0.000019 - momentum: 0.000000
2023-10-08 19:24:22,809 ----------------------------------------------------------------------------------------------------
2023-10-08 19:24:22,810 EPOCH 9 done: loss 0.0999 - lr: 0.000019
2023-10-08 19:24:29,562 DEV : loss 0.1511085480451584 - f1-score (micro avg) 0.7955
2023-10-08 19:24:29,568 ----------------------------------------------------------------------------------------------------
2023-10-08 19:24:39,074 epoch 10 - iter 12/121 - loss 0.08760911 - time (sec): 9.50 - samples/sec: 257.98 - lr: 0.000017 - momentum: 0.000000
2023-10-08 19:24:48,911 epoch 10 - iter 24/121 - loss 0.08754082 - time (sec): 19.34 - samples/sec: 261.98 - lr: 0.000015 - momentum: 0.000000
2023-10-08 19:24:58,309 epoch 10 - iter 36/121 - loss 0.08538821 - time (sec): 28.74 - samples/sec: 261.97 - lr: 0.000013 - momentum: 0.000000
2023-10-08 19:25:07,884 epoch 10 - iter 48/121 - loss 0.09277258 - time (sec): 38.32 - samples/sec: 263.60 - lr: 0.000012 - momentum: 0.000000
2023-10-08 19:25:16,446 epoch 10 - iter 60/121 - loss 0.09376592 - time (sec): 46.88 - samples/sec: 260.69 - lr: 0.000010 - momentum: 0.000000
2023-10-08 19:25:25,474 epoch 10 - iter 72/121 - loss 0.09247047 - time (sec): 55.90 - samples/sec: 259.57 - lr: 0.000008 - momentum: 0.000000
2023-10-08 19:25:34,831 epoch 10 - iter 84/121 - loss 0.09186989 - time (sec): 65.26 - samples/sec: 259.83 - lr: 0.000006 - momentum: 0.000000
2023-10-08 19:25:44,399 epoch 10 - iter 96/121 - loss 0.09164437 - time (sec): 74.83 - samples/sec: 260.31 - lr: 0.000005 - momentum: 0.000000
2023-10-08 19:25:53,952 epoch 10 - iter 108/121 - loss 0.08998798 - time (sec): 84.38 - samples/sec: 260.39 - lr: 0.000003 - momentum: 0.000000
2023-10-08 19:26:03,631 epoch 10 - iter 120/121 - loss 0.09293965 - time (sec): 94.06 - samples/sec: 261.29 - lr: 0.000001 - momentum: 0.000000
2023-10-08 19:26:04,327 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:04,328 EPOCH 10 done: loss 0.0930 - lr: 0.000001
2023-10-08 19:26:10,907 DEV : loss 0.15049995481967926 - f1-score (micro avg) 0.801
2023-10-08 19:26:10,913 saving best model
2023-10-08 19:26:16,181 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:16,183 Loading model from best epoch ...
2023-10-08 19:26:19,227 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
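The 25-tag dictionary above is exactly a BIOES tagging scheme over the six entity types: the outside tag O plus S-/B-/E-/I- variants of each type. A small illustrative reconstruction (the list comprehension is mine, not Flair code):

```python
entity_types = ["scope", "pers", "work", "loc", "object", "date"]

# BIOES: Single-token span, Beginning, End, Inside, plus the outside tag O.
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]

print(len(tags))  # 25, matching the linear layer's out_features=25
```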
2023-10-08 19:26:25,801
Results:
- F-score (micro) 0.794
- F-score (macro) 0.4775
- Accuracy 0.696
By class:
              precision    recall  f1-score   support

        pers     0.7973    0.8489    0.8223       139
       scope     0.7986    0.8915    0.8425       129
        work     0.6977    0.7500    0.7229        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7751    0.8139    0.7940       360
   macro avg     0.4587    0.4981    0.4775       360
weighted avg     0.7491    0.8139    0.7800       360
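The summary rows can be checked against the per-class rows: micro F1 is the harmonic mean of the pooled precision and recall, macro F1 is the unweighted mean of the five class F1 scores, and weighted F1 is the support-weighted mean:

```python
# Per-class (precision, recall, f1, support) from the table above.
by_class = {
    "pers":  (0.7973, 0.8489, 0.8223, 139),
    "scope": (0.7986, 0.8915, 0.8425, 129),
    "work":  (0.6977, 0.7500, 0.7229,  80),
    "loc":   (0.0000, 0.0000, 0.0000,   9),
    "date":  (0.0000, 0.0000, 0.0000,   3),
}

# Micro F1: harmonic mean of the pooled precision and recall.
p, r = 0.7751, 0.8139
micro_f1 = 2 * p * r / (p + r)

# Macro F1: unweighted mean of the class F1 scores (zeros for loc/date
# drag it down to ~0.48 despite strong pers/scope/work scores).
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted F1: support-weighted mean of the class F1 scores.
total = sum(s for *_, s in by_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
```

The loc (support 9) and date (support 3) classes score zero, which is why macro F1 (0.4775) sits far below micro F1 (0.7940).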
2023-10-08 19:26:25,801 ----------------------------------------------------------------------------------------------------