2023-10-06 23:17:42,929 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,930 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-06 23:17:42,930 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,930 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-06 23:17:42,930 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,931 Train: 1100 sentences
2023-10-06 23:17:42,931 (train_with_dev=False, train_with_test=False)
2023-10-06 23:17:42,931 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,931 Training Params:
2023-10-06 23:17:42,931 - learning_rate: "0.00015"
2023-10-06 23:17:42,931 - mini_batch_size: "4"
2023-10-06 23:17:42,931 - max_epochs: "10"
2023-10-06 23:17:42,931 - shuffle: "True"
2023-10-06 23:17:42,931 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,931 Plugins:
2023-10-06 23:17:42,931 - TensorboardLogger
2023-10-06 23:17:42,931 - LinearScheduler | warmup_fraction: '0.1'
2023-10-06 23:17:42,931 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,931 Final evaluation on model from best epoch (best-model.pt)
2023-10-06 23:17:42,931 - metric: "('micro avg', 'f1-score')"
2023-10-06 23:17:42,931 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,931 Computation:
2023-10-06 23:17:42,931 - compute on device: cuda:0
2023-10-06 23:17:42,931 - embedding storage: none
2023-10-06 23:17:42,932 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,932 Model training base path: "hmbench-ajmc/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3"
2023-10-06 23:17:42,932 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,932 ----------------------------------------------------------------------------------------------------
2023-10-06 23:17:42,932 Logging anything other than scalars to TensorBoard is currently not supported.
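The parameters and plugins listed above correspond to a standard Flair fine-tuning run. A minimal sketch of how a comparable run could be launched follows; it is an approximation, not the exact hmbench script: TransformerWordEmbeddings stands in for the ByT5Embeddings wrapper shown in the model dump, the base model name is inferred from the training base path, and the NER_HIPE_2022 loader arguments are inferred from the corpus line above.

    # Sketch only: approximates the logged setup with standard Flair APIs.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_HIPE_2022(dataset_name="ajmc", language="de")  # 1100 train / 206 dev / 240 test sentences
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings(
        "hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # inferred from the base path
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,              # unused here: no RNN/CRF, only the linear head on top of the embeddings
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-ajmc/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3",
        learning_rate=0.00015,
        mini_batch_size=4,
        max_epochs=10,
    )

fine_tune defaults to a linear learning-rate schedule with warmup, which is consistent with the LinearScheduler plugin (warmup_fraction 0.1) and with the per-iteration lr values below: warmup to roughly 0.000147 during epoch 1, then linear decay toward 0.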
2023-10-06 23:17:53,081 epoch 1 - iter 27/275 - loss 3.20779212 - time (sec): 10.15 - samples/sec: 215.71 - lr: 0.000014 - momentum: 0.000000
2023-10-06 23:18:04,015 epoch 1 - iter 54/275 - loss 3.19893422 - time (sec): 21.08 - samples/sec: 212.08 - lr: 0.000029 - momentum: 0.000000
2023-10-06 23:18:15,565 epoch 1 - iter 81/275 - loss 3.18017401 - time (sec): 32.63 - samples/sec: 213.13 - lr: 0.000044 - momentum: 0.000000
2023-10-06 23:18:26,626 epoch 1 - iter 108/275 - loss 3.14294964 - time (sec): 43.69 - samples/sec: 210.67 - lr: 0.000058 - momentum: 0.000000
2023-10-06 23:18:38,029 epoch 1 - iter 135/275 - loss 3.06861851 - time (sec): 55.10 - samples/sec: 210.27 - lr: 0.000073 - momentum: 0.000000
2023-10-06 23:18:48,668 epoch 1 - iter 162/275 - loss 2.98386536 - time (sec): 65.74 - samples/sec: 208.75 - lr: 0.000088 - momentum: 0.000000
2023-10-06 23:18:59,398 epoch 1 - iter 189/275 - loss 2.87891269 - time (sec): 76.46 - samples/sec: 207.70 - lr: 0.000103 - momentum: 0.000000
2023-10-06 23:19:09,952 epoch 1 - iter 216/275 - loss 2.77298535 - time (sec): 87.02 - samples/sec: 207.98 - lr: 0.000117 - momentum: 0.000000
2023-10-06 23:19:20,467 epoch 1 - iter 243/275 - loss 2.66563305 - time (sec): 97.53 - samples/sec: 207.28 - lr: 0.000132 - momentum: 0.000000
2023-10-06 23:19:30,935 epoch 1 - iter 270/275 - loss 2.54633852 - time (sec): 108.00 - samples/sec: 206.16 - lr: 0.000147 - momentum: 0.000000
2023-10-06 23:19:33,191 ----------------------------------------------------------------------------------------------------
2023-10-06 23:19:33,192 EPOCH 1 done: loss 2.5171 - lr: 0.000147
2023-10-06 23:19:39,803 DEV : loss 1.1775470972061157 - f1-score (micro avg) 0.0
2023-10-06 23:19:39,809 ----------------------------------------------------------------------------------------------------
2023-10-06 23:19:50,771 epoch 2 - iter 27/275 - loss 1.03372971 - time (sec): 10.96 - samples/sec: 215.51 - lr: 0.000148 - momentum: 0.000000
2023-10-06 23:20:02,347 epoch 2 - iter 54/275 - loss 1.00951904 - time (sec): 22.54 - samples/sec: 210.82 - lr: 0.000147 - momentum: 0.000000
2023-10-06 23:20:13,223 epoch 2 - iter 81/275 - loss 0.91537880 - time (sec): 33.41 - samples/sec: 206.90 - lr: 0.000145 - momentum: 0.000000
2023-10-06 23:20:24,587 epoch 2 - iter 108/275 - loss 0.89903855 - time (sec): 44.78 - samples/sec: 208.39 - lr: 0.000144 - momentum: 0.000000
2023-10-06 23:20:35,651 epoch 2 - iter 135/275 - loss 0.85333467 - time (sec): 55.84 - samples/sec: 207.59 - lr: 0.000142 - momentum: 0.000000
2023-10-06 23:20:46,118 epoch 2 - iter 162/275 - loss 0.82046048 - time (sec): 66.31 - samples/sec: 206.46 - lr: 0.000140 - momentum: 0.000000
2023-10-06 23:20:57,032 epoch 2 - iter 189/275 - loss 0.80119432 - time (sec): 77.22 - samples/sec: 205.82 - lr: 0.000139 - momentum: 0.000000
2023-10-06 23:21:07,772 epoch 2 - iter 216/275 - loss 0.77353092 - time (sec): 87.96 - samples/sec: 206.14 - lr: 0.000137 - momentum: 0.000000
2023-10-06 23:21:17,964 epoch 2 - iter 243/275 - loss 0.74058223 - time (sec): 98.15 - samples/sec: 204.84 - lr: 0.000135 - momentum: 0.000000
2023-10-06 23:21:29,022 epoch 2 - iter 270/275 - loss 0.70448315 - time (sec): 109.21 - samples/sec: 204.95 - lr: 0.000134 - momentum: 0.000000
2023-10-06 23:21:30,900 ----------------------------------------------------------------------------------------------------
2023-10-06 23:21:30,900 EPOCH 2 done: loss 0.7014 - lr: 0.000134
2023-10-06 23:21:37,665 DEV : loss 0.42873144149780273 - f1-score (micro avg) 0.5582
2023-10-06 23:21:37,671 saving best model
2023-10-06 23:21:38,639 ----------------------------------------------------------------------------------------------------
2023-10-06 23:21:49,785 epoch 3 - iter 27/275 - loss 0.43704065 - time (sec): 11.14 - samples/sec: 212.84 - lr: 0.000132 - momentum: 0.000000
2023-10-06 23:22:00,388 epoch 3 - iter 54/275 - loss 0.39009542 - time (sec): 21.75 - samples/sec: 208.48 - lr: 0.000130 - momentum: 0.000000
2023-10-06 23:22:11,535 epoch 3 - iter 81/275 - loss 0.37638191 - time (sec): 32.89 - samples/sec: 210.19 - lr: 0.000129 - momentum: 0.000000
2023-10-06 23:22:22,286 epoch 3 - iter 108/275 - loss 0.36351112 - time (sec): 43.65 - samples/sec: 208.11 - lr: 0.000127 - momentum: 0.000000
2023-10-06 23:22:33,756 epoch 3 - iter 135/275 - loss 0.34380491 - time (sec): 55.12 - samples/sec: 207.60 - lr: 0.000125 - momentum: 0.000000
2023-10-06 23:22:44,252 epoch 3 - iter 162/275 - loss 0.33317147 - time (sec): 65.61 - samples/sec: 206.76 - lr: 0.000124 - momentum: 0.000000
2023-10-06 23:22:55,521 epoch 3 - iter 189/275 - loss 0.32264027 - time (sec): 76.88 - samples/sec: 206.42 - lr: 0.000122 - momentum: 0.000000
2023-10-06 23:23:06,782 epoch 3 - iter 216/275 - loss 0.31955630 - time (sec): 88.14 - samples/sec: 207.28 - lr: 0.000120 - momentum: 0.000000
2023-10-06 23:23:16,907 epoch 3 - iter 243/275 - loss 0.31378673 - time (sec): 98.27 - samples/sec: 205.84 - lr: 0.000119 - momentum: 0.000000
2023-10-06 23:23:27,385 epoch 3 - iter 270/275 - loss 0.30770208 - time (sec): 108.75 - samples/sec: 206.31 - lr: 0.000117 - momentum: 0.000000
2023-10-06 23:23:29,177 ----------------------------------------------------------------------------------------------------
2023-10-06 23:23:29,177 EPOCH 3 done: loss 0.3081 - lr: 0.000117
2023-10-06 23:23:35,739 DEV : loss 0.21868450939655304 - f1-score (micro avg) 0.7855
2023-10-06 23:23:35,744 saving best model
2023-10-06 23:23:36,679 ----------------------------------------------------------------------------------------------------
2023-10-06 23:23:47,415 epoch 4 - iter 27/275 - loss 0.23843334 - time (sec): 10.73 - samples/sec: 210.93 - lr: 0.000115 - momentum: 0.000000
2023-10-06 23:23:58,293 epoch 4 - iter 54/275 - loss 0.23321435 - time (sec): 21.61 - samples/sec: 213.77 - lr: 0.000114 - momentum: 0.000000
2023-10-06 23:24:09,026 epoch 4 - iter 81/275 - loss 0.22124299 - time (sec): 32.34 - samples/sec: 211.47 - lr: 0.000112 - momentum: 0.000000
2023-10-06 23:24:19,058 epoch 4 - iter 108/275 - loss 0.20355063 - time (sec): 42.38 - samples/sec: 206.93 - lr: 0.000110 - momentum: 0.000000
2023-10-06 23:24:30,407 epoch 4 - iter 135/275 - loss 0.19004237 - time (sec): 53.73 - samples/sec: 207.67 - lr: 0.000109 - momentum: 0.000000
2023-10-06 23:24:41,880 epoch 4 - iter 162/275 - loss 0.18252030 - time (sec): 65.20 - samples/sec: 209.48 - lr: 0.000107 - momentum: 0.000000
2023-10-06 23:24:52,348 epoch 4 - iter 189/275 - loss 0.17924747 - time (sec): 75.67 - samples/sec: 207.77 - lr: 0.000105 - momentum: 0.000000
2023-10-06 23:25:03,353 epoch 4 - iter 216/275 - loss 0.17416266 - time (sec): 86.67 - samples/sec: 207.01 - lr: 0.000104 - momentum: 0.000000
2023-10-06 23:25:13,778 epoch 4 - iter 243/275 - loss 0.16862196 - time (sec): 97.10 - samples/sec: 205.85 - lr: 0.000102 - momentum: 0.000000
2023-10-06 23:25:24,256 epoch 4 - iter 270/275 - loss 0.16310851 - time (sec): 107.58 - samples/sec: 206.76 - lr: 0.000101 - momentum: 0.000000
2023-10-06 23:25:26,700 ----------------------------------------------------------------------------------------------------
2023-10-06 23:25:26,700 EPOCH 4 done: loss 0.1624 - lr: 0.000101
2023-10-06 23:25:33,324 DEV : loss 0.1493232250213623 - f1-score (micro avg) 0.8382
2023-10-06 23:25:33,330 saving best model
2023-10-06 23:25:34,248 ----------------------------------------------------------------------------------------------------
2023-10-06 23:25:45,066 epoch 5 - iter 27/275 - loss 0.08466083 - time (sec): 10.82 - samples/sec: 202.38 - lr: 0.000099 - momentum: 0.000000
2023-10-06 23:25:55,975 epoch 5 - iter 54/275 - loss 0.09065331 - time (sec): 21.72 - samples/sec: 207.69 - lr: 0.000097 - momentum: 0.000000
2023-10-06 23:26:06,526 epoch 5 - iter 81/275 - loss 0.09365691 - time (sec): 32.28 - samples/sec: 206.13 - lr: 0.000095 - momentum: 0.000000
2023-10-06 23:26:17,220 epoch 5 - iter 108/275 - loss 0.10027437 - time (sec): 42.97 - samples/sec: 204.70 - lr: 0.000094 - momentum: 0.000000
2023-10-06 23:26:27,592 epoch 5 - iter 135/275 - loss 0.10170985 - time (sec): 53.34 - samples/sec: 203.78 - lr: 0.000092 - momentum: 0.000000
2023-10-06 23:26:39,008 epoch 5 - iter 162/275 - loss 0.09929130 - time (sec): 64.76 - samples/sec: 203.88 - lr: 0.000090 - momentum: 0.000000
2023-10-06 23:26:49,894 epoch 5 - iter 189/275 - loss 0.09494369 - time (sec): 75.64 - samples/sec: 205.05 - lr: 0.000089 - momentum: 0.000000
2023-10-06 23:27:00,929 epoch 5 - iter 216/275 - loss 0.09307176 - time (sec): 86.68 - samples/sec: 204.98 - lr: 0.000087 - momentum: 0.000000
2023-10-06 23:27:11,558 epoch 5 - iter 243/275 - loss 0.10031013 - time (sec): 97.31 - samples/sec: 205.28 - lr: 0.000086 - momentum: 0.000000
2023-10-06 23:27:22,376 epoch 5 - iter 270/275 - loss 0.09762639 - time (sec): 108.13 - samples/sec: 205.60 - lr: 0.000084 - momentum: 0.000000
2023-10-06 23:27:24,773 ----------------------------------------------------------------------------------------------------
2023-10-06 23:27:24,774 EPOCH 5 done: loss 0.0987 - lr: 0.000084
2023-10-06 23:27:31,457 DEV : loss 0.12611618638038635 - f1-score (micro avg) 0.8616
2023-10-06 23:27:31,465 saving best model
2023-10-06 23:27:32,384 ----------------------------------------------------------------------------------------------------
2023-10-06 23:27:43,387 epoch 6 - iter 27/275 - loss 0.07895610 - time (sec): 11.00 - samples/sec: 207.51 - lr: 0.000082 - momentum: 0.000000
2023-10-06 23:27:53,456 epoch 6 - iter 54/275 - loss 0.09273836 - time (sec): 21.07 - samples/sec: 204.64 - lr: 0.000080 - momentum: 0.000000
2023-10-06 23:28:03,998 epoch 6 - iter 81/275 - loss 0.08414141 - time (sec): 31.61 - samples/sec: 203.97 - lr: 0.000079 - momentum: 0.000000
2023-10-06 23:28:14,503 epoch 6 - iter 108/275 - loss 0.08316482 - time (sec): 42.12 - samples/sec: 203.43 - lr: 0.000077 - momentum: 0.000000
2023-10-06 23:28:25,564 epoch 6 - iter 135/275 - loss 0.07871307 - time (sec): 53.18 - samples/sec: 202.39 - lr: 0.000075 - momentum: 0.000000
2023-10-06 23:28:36,883 epoch 6 - iter 162/275 - loss 0.08138135 - time (sec): 64.50 - samples/sec: 204.35 - lr: 0.000074 - momentum: 0.000000
2023-10-06 23:28:47,438 epoch 6 - iter 189/275 - loss 0.08231468 - time (sec): 75.05 - samples/sec: 204.10 - lr: 0.000072 - momentum: 0.000000
2023-10-06 23:28:58,758 epoch 6 - iter 216/275 - loss 0.08734570 - time (sec): 86.37 - samples/sec: 204.81 - lr: 0.000071 - momentum: 0.000000
2023-10-06 23:29:09,868 epoch 6 - iter 243/275 - loss 0.08052102 - time (sec): 97.48 - samples/sec: 205.70 - lr: 0.000069 - momentum: 0.000000
2023-10-06 23:29:20,681 epoch 6 - iter 270/275 - loss 0.07520156 - time (sec): 108.30 - samples/sec: 206.50 - lr: 0.000067 - momentum: 0.000000
2023-10-06 23:29:22,697 ----------------------------------------------------------------------------------------------------
2023-10-06 23:29:22,697 EPOCH 6 done: loss 0.0743 - lr: 0.000067
2023-10-06 23:29:29,351 DEV : loss 0.12128882855176926 - f1-score (micro avg) 0.8758
2023-10-06 23:29:29,356 saving best model
2023-10-06 23:29:30,274 ----------------------------------------------------------------------------------------------------
2023-10-06 23:29:40,216 epoch 7 - iter 27/275 - loss 0.05932625 - time (sec): 9.94 - samples/sec: 200.08 - lr: 0.000065 - momentum: 0.000000
2023-10-06 23:29:51,868 epoch 7 - iter 54/275 - loss 0.04672326 - time (sec): 21.59 - samples/sec: 205.76 - lr: 0.000064 - momentum: 0.000000
2023-10-06 23:30:02,697 epoch 7 - iter 81/275 - loss 0.04533032 - time (sec): 32.42 - samples/sec: 207.73 - lr: 0.000062 - momentum: 0.000000
2023-10-06 23:30:13,412 epoch 7 - iter 108/275 - loss 0.04353625 - time (sec): 43.14 - samples/sec: 208.50 - lr: 0.000060 - momentum: 0.000000
2023-10-06 23:30:24,374 epoch 7 - iter 135/275 - loss 0.05327309 - time (sec): 54.10 - samples/sec: 210.45 - lr: 0.000059 - momentum: 0.000000
2023-10-06 23:30:35,356 epoch 7 - iter 162/275 - loss 0.05492008 - time (sec): 65.08 - samples/sec: 210.63 - lr: 0.000057 - momentum: 0.000000
2023-10-06 23:30:45,948 epoch 7 - iter 189/275 - loss 0.05356602 - time (sec): 75.67 - samples/sec: 209.59 - lr: 0.000056 - momentum: 0.000000
2023-10-06 23:30:56,774 epoch 7 - iter 216/275 - loss 0.05638404 - time (sec): 86.50 - samples/sec: 209.12 - lr: 0.000054 - momentum: 0.000000
2023-10-06 23:31:07,501 epoch 7 - iter 243/275 - loss 0.05727416 - time (sec): 97.23 - samples/sec: 207.60 - lr: 0.000052 - momentum: 0.000000
2023-10-06 23:31:18,650 epoch 7 - iter 270/275 - loss 0.05734359 - time (sec): 108.37 - samples/sec: 207.38 - lr: 0.000051 - momentum: 0.000000
2023-10-06 23:31:20,274 ----------------------------------------------------------------------------------------------------
2023-10-06 23:31:20,274 EPOCH 7 done: loss 0.0575 - lr: 0.000051
2023-10-06 23:31:26,927 DEV : loss 0.12337013334035873 - f1-score (micro avg) 0.8783
2023-10-06 23:31:26,932 saving best model
2023-10-06 23:31:28,029 ----------------------------------------------------------------------------------------------------
2023-10-06 23:31:38,496 epoch 8 - iter 27/275 - loss 0.06679142 - time (sec): 10.47 - samples/sec: 205.91 - lr: 0.000049 - momentum: 0.000000
2023-10-06 23:31:49,201 epoch 8 - iter 54/275 - loss 0.05175168 - time (sec): 21.17 - samples/sec: 206.19 - lr: 0.000047 - momentum: 0.000000
2023-10-06 23:31:59,409 epoch 8 - iter 81/275 - loss 0.04590241 - time (sec): 31.38 - samples/sec: 206.03 - lr: 0.000045 - momentum: 0.000000
2023-10-06 23:32:09,829 epoch 8 - iter 108/275 - loss 0.05025701 - time (sec): 41.80 - samples/sec: 206.68 - lr: 0.000044 - momentum: 0.000000
2023-10-06 23:32:20,652 epoch 8 - iter 135/275 - loss 0.04689769 - time (sec): 52.62 - samples/sec: 205.85 - lr: 0.000042 - momentum: 0.000000
2023-10-06 23:32:32,043 epoch 8 - iter 162/275 - loss 0.04687291 - time (sec): 64.01 - samples/sec: 207.51 - lr: 0.000041 - momentum: 0.000000
2023-10-06 23:32:43,660 epoch 8 - iter 189/275 - loss 0.04632883 - time (sec): 75.63 - samples/sec: 207.27 - lr: 0.000039 - momentum: 0.000000
2023-10-06 23:32:55,088 epoch 8 - iter 216/275 - loss 0.05000443 - time (sec): 87.06 - samples/sec: 207.66 - lr: 0.000037 - momentum: 0.000000
2023-10-06 23:33:05,581 epoch 8 - iter 243/275 - loss 0.04980880 - time (sec): 97.55 - samples/sec: 207.44 - lr: 0.000036 - momentum: 0.000000
2023-10-06 23:33:15,960 epoch 8 - iter 270/275 - loss 0.04839911 - time (sec): 107.93 - samples/sec: 207.10 - lr: 0.000034 - momentum: 0.000000
2023-10-06 23:33:18,029 ----------------------------------------------------------------------------------------------------
2023-10-06 23:33:18,030 EPOCH 8 done: loss 0.0477 - lr: 0.000034
2023-10-06 23:33:24,660 DEV : loss 0.127992644906044 - f1-score (micro avg) 0.873
2023-10-06 23:33:24,666 ----------------------------------------------------------------------------------------------------
2023-10-06 23:33:34,726 epoch 9 - iter 27/275 - loss 0.04178413 - time (sec): 10.06 - samples/sec: 199.43 - lr: 0.000032 - momentum: 0.000000
2023-10-06 23:33:45,547 epoch 9 - iter 54/275 - loss 0.05131075 - time (sec): 20.88 - samples/sec: 204.79 - lr: 0.000030 - momentum: 0.000000
2023-10-06 23:33:56,356 epoch 9 - iter 81/275 - loss 0.05860657 - time (sec): 31.69 - samples/sec: 206.35 - lr: 0.000029 - momentum: 0.000000
2023-10-06 23:34:06,475 epoch 9 - iter 108/275 - loss 0.05561689 - time (sec): 41.81 - samples/sec: 203.29 - lr: 0.000027 - momentum: 0.000000
2023-10-06 23:34:17,273 epoch 9 - iter 135/275 - loss 0.05086730 - time (sec): 52.61 - samples/sec: 203.17 - lr: 0.000026 - momentum: 0.000000
2023-10-06 23:34:28,378 epoch 9 - iter 162/275 - loss 0.04692919 - time (sec): 63.71 - samples/sec: 204.55 - lr: 0.000024 - momentum: 0.000000
2023-10-06 23:34:39,748 epoch 9 - iter 189/275 - loss 0.04599155 - time (sec): 75.08 - samples/sec: 206.27 - lr: 0.000022 - momentum: 0.000000
2023-10-06 23:34:51,078 epoch 9 - iter 216/275 - loss 0.04321166 - time (sec): 86.41 - samples/sec: 206.78 - lr: 0.000021 - momentum: 0.000000
2023-10-06 23:35:01,862 epoch 9 - iter 243/275 - loss 0.04328397 - time (sec): 97.19 - samples/sec: 206.47 - lr: 0.000019 - momentum: 0.000000
2023-10-06 23:35:12,650 epoch 9 - iter 270/275 - loss 0.04290629 - time (sec): 107.98 - samples/sec: 206.79 - lr: 0.000017 - momentum: 0.000000
2023-10-06 23:35:14,703 ----------------------------------------------------------------------------------------------------
2023-10-06 23:35:14,704 EPOCH 9 done: loss 0.0429 - lr: 0.000017
2023-10-06 23:35:21,542 DEV : loss 0.1252521574497223 - f1-score (micro avg) 0.883
2023-10-06 23:35:21,548 saving best model
2023-10-06 23:35:22,470 ----------------------------------------------------------------------------------------------------
2023-10-06 23:35:32,902 epoch 10 - iter 27/275 - loss 0.03780507 - time (sec): 10.43 - samples/sec: 203.07 - lr: 0.000015 - momentum: 0.000000
2023-10-06 23:35:44,530 epoch 10 - iter 54/275 - loss 0.02927955 - time (sec): 22.06 - samples/sec: 203.91 - lr: 0.000014 - momentum: 0.000000
2023-10-06 23:35:54,510 epoch 10 - iter 81/275 - loss 0.04116419 - time (sec): 32.04 - samples/sec: 201.14 - lr: 0.000012 - momentum: 0.000000
2023-10-06 23:36:04,734 epoch 10 - iter 108/275 - loss 0.04094626 - time (sec): 42.26 - samples/sec: 200.70 - lr: 0.000011 - momentum: 0.000000
2023-10-06 23:36:16,023 epoch 10 - iter 135/275 - loss 0.04004258 - time (sec): 53.55 - samples/sec: 203.73 - lr: 0.000009 - momentum: 0.000000
2023-10-06 23:36:26,817 epoch 10 - iter 162/275 - loss 0.03973698 - time (sec): 64.35 - samples/sec: 204.54 - lr: 0.000007 - momentum: 0.000000
2023-10-06 23:36:37,043 epoch 10 - iter 189/275 - loss 0.04053412 - time (sec): 74.57 - samples/sec: 203.43 - lr: 0.000006 - momentum: 0.000000
2023-10-06 23:36:48,182 epoch 10 - iter 216/275 - loss 0.04054429 - time (sec): 85.71 - samples/sec: 204.55 - lr: 0.000004 - momentum: 0.000000
2023-10-06 23:36:59,116 epoch 10 - iter 243/275 - loss 0.04240430 - time (sec): 96.64 - samples/sec: 205.37 - lr: 0.000002 - momentum: 0.000000
2023-10-06 23:37:10,460 epoch 10 - iter 270/275 - loss 0.04062955 - time (sec): 107.99 - samples/sec: 206.36 - lr: 0.000001 - momentum: 0.000000
2023-10-06 23:37:12,554 ----------------------------------------------------------------------------------------------------
2023-10-06 23:37:12,555 EPOCH 10 done: loss 0.0405 - lr: 0.000001
2023-10-06 23:37:19,202 DEV : loss 0.1265440583229065 - f1-score (micro avg) 0.8788
2023-10-06 23:37:20,098 ----------------------------------------------------------------------------------------------------
2023-10-06 23:37:20,100 Loading model from best epoch ...
2023-10-06 23:37:24,178 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-06 23:37:31,313
Results:
- F-score (micro) 0.9129
- F-score (macro) 0.5465
- Accuracy 0.8561

By class:
              precision    recall  f1-score   support

       scope     0.9011    0.9318    0.9162       176
        pers     0.9457    0.9531    0.9494       128
        work     0.8553    0.8784    0.8667        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.9070    0.9188    0.9129       382
   macro avg     0.5404    0.5527    0.5465       382
weighted avg     0.8977    0.9188    0.9081       382

2023-10-06 23:37:31,313 ----------------------------------------------------------------------------------------------------
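Assuming the standard Flair prediction API, the best checkpoint evaluated above could be applied to new text roughly as follows; the checkpoint path is the logged base path plus best-model.pt, and the example sentence is hypothetical.

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the checkpoint written whenever "saving best model" appears in the log above.
    tagger = SequenceTagger.load(
        "hmbench-ajmc/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3/best-model.pt"
    )

    sentence = Sentence("Vgl. Wilamowitz, Aristoteles und Athen, Berlin 1893.")  # hypothetical input
    tagger.predict(sentence)

    # Print predicted spans (scope, pers, work, loc, object, date per the tag dictionary above).
    for span in sentence.get_spans(tagger.label_type):
        print(span)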