stefan-it — Upload folder using huggingface_hub (commit 0ae42dd)
2023-10-07 02:12:28,913 ----------------------------------------------------------------------------------------------------
2023-10-07 02:12:28,914 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-07 02:12:28,914 ----------------------------------------------------------------------------------------------------
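A quick sanity check of the attention geometry implied by the module shapes in the dump above. The head count of 6 is read off `relative_attention_bias: Embedding(32, 6)` (32 relative-position buckets, one bias per head); the per-head width of 64 is an inference from those shapes, not something the log states directly:

```python
# Shapes taken from the T5Attention and T5DenseGatedActDense modules logged above.
d_model = 1472    # in_features of q/k/v, out_features of o
qkv_out = 384     # out_features of the q/k/v projections
n_heads = 6       # second dim of relative_attention_bias: Embedding(32, 6)

head_dim = qkv_out // n_heads
assert n_heads * head_dim == qkv_out
print(head_dim)   # 64 (inferred per-head width; o projects the 384-dim concat back to 1472)

ffn_inner = 3584  # wi_0/wi_1 out_features of the gated-GELU FFN
print(round(ffn_inner / d_model, 2))   # 2.43 expansion factor
```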
2023-10-07 02:12:28,914 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-07 02:12:28,914 ----------------------------------------------------------------------------------------------------
2023-10-07 02:12:28,914 Train: 1100 sentences
2023-10-07 02:12:28,914 (train_with_dev=False, train_with_test=False)
2023-10-07 02:12:28,914 ----------------------------------------------------------------------------------------------------
2023-10-07 02:12:28,914 Training Params:
2023-10-07 02:12:28,914 - learning_rate: "0.00016"
2023-10-07 02:12:28,914 - mini_batch_size: "4"
2023-10-07 02:12:28,914 - max_epochs: "10"
2023-10-07 02:12:28,914 - shuffle: "True"
2023-10-07 02:12:28,914 ----------------------------------------------------------------------------------------------------
2023-10-07 02:12:28,914 Plugins:
2023-10-07 02:12:28,914 - TensorboardLogger
2023-10-07 02:12:28,915 - LinearScheduler | warmup_fraction: '0.1'
2023-10-07 02:12:28,915 ----------------------------------------------------------------------------------------------------
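The `LinearScheduler | warmup_fraction: '0.1'` plugin implies a linear warmup to the peak learning rate of 0.00016 over the first 10% of steps, then linear decay to zero. A minimal sketch of that shape, assuming 0-based steps and 10 epochs × 275 iterations = 2750 total steps (this is a reconstruction from the logged lr values, not Flair's actual implementation):

```python
# Linear warmup/decay schedule implied by the lr values in the epoch logs below.
def linear_lr(step, total_steps=2750, peak_lr=0.00016, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)   # 275 = first epoch
    if step < warmup_steps:
        return peak_lr * step / warmup_steps            # linear ramp up
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # linear decay

# Cross-checks against logged values (steps are 0-based):
print(f"{linear_lr(26):.6f}")    # 0.000015 -> epoch 1, iter 27/275
print(f"{linear_lr(269):.6f}")   # 0.000157 -> epoch 1, iter 270/275
print(f"{linear_lr(301):.6f}")   # 0.000158 -> epoch 2, iter 27/275
```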
2023-10-07 02:12:28,915 Final evaluation on model from best epoch (best-model.pt)
2023-10-07 02:12:28,915 - metric: "('micro avg', 'f1-score')"
2023-10-07 02:12:28,915 ----------------------------------------------------------------------------------------------------
2023-10-07 02:12:28,915 Computation:
2023-10-07 02:12:28,915 - compute on device: cuda:0
2023-10-07 02:12:28,915 - embedding storage: none
2023-10-07 02:12:28,915 ----------------------------------------------------------------------------------------------------
2023-10-07 02:12:28,915 Model training base path: "hmbench-ajmc/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-07 02:12:28,915 ----------------------------------------------------------------------------------------------------
2023-10-07 02:12:28,915 ----------------------------------------------------------------------------------------------------
2023-10-07 02:12:28,915 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-07 02:12:39,998 epoch 1 - iter 27/275 - loss 3.23597451 - time (sec): 11.08 - samples/sec: 215.66 - lr: 0.000015 - momentum: 0.000000
2023-10-07 02:12:50,779 epoch 1 - iter 54/275 - loss 3.22462504 - time (sec): 21.86 - samples/sec: 217.59 - lr: 0.000031 - momentum: 0.000000
2023-10-07 02:13:01,461 epoch 1 - iter 81/275 - loss 3.20370274 - time (sec): 32.54 - samples/sec: 215.31 - lr: 0.000047 - momentum: 0.000000
2023-10-07 02:13:11,505 epoch 1 - iter 108/275 - loss 3.16206007 - time (sec): 42.59 - samples/sec: 213.67 - lr: 0.000062 - momentum: 0.000000
2023-10-07 02:13:22,015 epoch 1 - iter 135/275 - loss 3.08191132 - time (sec): 53.10 - samples/sec: 212.60 - lr: 0.000078 - momentum: 0.000000
2023-10-07 02:13:32,226 epoch 1 - iter 162/275 - loss 2.98632345 - time (sec): 63.31 - samples/sec: 212.12 - lr: 0.000094 - momentum: 0.000000
2023-10-07 02:13:42,861 epoch 1 - iter 189/275 - loss 2.87335549 - time (sec): 73.94 - samples/sec: 212.19 - lr: 0.000109 - momentum: 0.000000
2023-10-07 02:13:53,276 epoch 1 - iter 216/275 - loss 2.75590896 - time (sec): 84.36 - samples/sec: 211.51 - lr: 0.000125 - momentum: 0.000000
2023-10-07 02:14:03,691 epoch 1 - iter 243/275 - loss 2.63230218 - time (sec): 94.78 - samples/sec: 211.23 - lr: 0.000141 - momentum: 0.000000
2023-10-07 02:14:14,534 epoch 1 - iter 270/275 - loss 2.49511163 - time (sec): 105.62 - samples/sec: 211.72 - lr: 0.000157 - momentum: 0.000000
2023-10-07 02:14:16,472 ----------------------------------------------------------------------------------------------------
2023-10-07 02:14:16,472 EPOCH 1 done: loss 2.4684 - lr: 0.000157
2023-10-07 02:14:22,844 DEV : loss 1.106745719909668 - f1-score (micro avg) 0.0
2023-10-07 02:14:22,849 ----------------------------------------------------------------------------------------------------
2023-10-07 02:14:32,759 epoch 2 - iter 27/275 - loss 1.03794459 - time (sec): 9.91 - samples/sec: 201.13 - lr: 0.000158 - momentum: 0.000000
2023-10-07 02:14:43,440 epoch 2 - iter 54/275 - loss 0.93637007 - time (sec): 20.59 - samples/sec: 209.04 - lr: 0.000157 - momentum: 0.000000
2023-10-07 02:14:53,649 epoch 2 - iter 81/275 - loss 0.93217268 - time (sec): 30.80 - samples/sec: 210.53 - lr: 0.000155 - momentum: 0.000000
2023-10-07 02:15:04,743 epoch 2 - iter 108/275 - loss 0.89211071 - time (sec): 41.89 - samples/sec: 212.38 - lr: 0.000153 - momentum: 0.000000
2023-10-07 02:15:15,037 epoch 2 - iter 135/275 - loss 0.85263776 - time (sec): 52.19 - samples/sec: 210.25 - lr: 0.000151 - momentum: 0.000000
2023-10-07 02:15:26,454 epoch 2 - iter 162/275 - loss 0.78822686 - time (sec): 63.60 - samples/sec: 211.47 - lr: 0.000150 - momentum: 0.000000
2023-10-07 02:15:36,979 epoch 2 - iter 189/275 - loss 0.76715336 - time (sec): 74.13 - samples/sec: 212.09 - lr: 0.000148 - momentum: 0.000000
2023-10-07 02:15:47,831 epoch 2 - iter 216/275 - loss 0.73485356 - time (sec): 84.98 - samples/sec: 212.67 - lr: 0.000146 - momentum: 0.000000
2023-10-07 02:15:58,308 epoch 2 - iter 243/275 - loss 0.69914070 - time (sec): 95.46 - samples/sec: 212.73 - lr: 0.000144 - momentum: 0.000000
2023-10-07 02:16:08,417 epoch 2 - iter 270/275 - loss 0.67613944 - time (sec): 105.57 - samples/sec: 211.86 - lr: 0.000143 - momentum: 0.000000
2023-10-07 02:16:10,370 ----------------------------------------------------------------------------------------------------
2023-10-07 02:16:10,370 EPOCH 2 done: loss 0.6708 - lr: 0.000143
2023-10-07 02:16:16,959 DEV : loss 0.4049922525882721 - f1-score (micro avg) 0.5657
2023-10-07 02:16:16,964 saving best model
2023-10-07 02:16:17,905 ----------------------------------------------------------------------------------------------------
2023-10-07 02:16:27,808 epoch 3 - iter 27/275 - loss 0.36210755 - time (sec): 9.90 - samples/sec: 204.70 - lr: 0.000141 - momentum: 0.000000
2023-10-07 02:16:38,961 epoch 3 - iter 54/275 - loss 0.34685352 - time (sec): 21.06 - samples/sec: 212.25 - lr: 0.000139 - momentum: 0.000000
2023-10-07 02:16:48,704 epoch 3 - iter 81/275 - loss 0.33215168 - time (sec): 30.80 - samples/sec: 210.05 - lr: 0.000137 - momentum: 0.000000
2023-10-07 02:16:59,168 epoch 3 - iter 108/275 - loss 0.31409223 - time (sec): 41.26 - samples/sec: 209.97 - lr: 0.000135 - momentum: 0.000000
2023-10-07 02:17:10,897 epoch 3 - iter 135/275 - loss 0.31849588 - time (sec): 52.99 - samples/sec: 212.00 - lr: 0.000134 - momentum: 0.000000
2023-10-07 02:17:21,087 epoch 3 - iter 162/275 - loss 0.31293264 - time (sec): 63.18 - samples/sec: 211.31 - lr: 0.000132 - momentum: 0.000000
2023-10-07 02:17:31,541 epoch 3 - iter 189/275 - loss 0.31181772 - time (sec): 73.64 - samples/sec: 212.57 - lr: 0.000130 - momentum: 0.000000
2023-10-07 02:17:41,640 epoch 3 - iter 216/275 - loss 0.29993482 - time (sec): 83.73 - samples/sec: 211.98 - lr: 0.000128 - momentum: 0.000000
2023-10-07 02:17:53,115 epoch 3 - iter 243/275 - loss 0.28383867 - time (sec): 95.21 - samples/sec: 211.68 - lr: 0.000127 - momentum: 0.000000
2023-10-07 02:18:03,420 epoch 3 - iter 270/275 - loss 0.28319392 - time (sec): 105.51 - samples/sec: 211.11 - lr: 0.000125 - momentum: 0.000000
2023-10-07 02:18:05,667 ----------------------------------------------------------------------------------------------------
2023-10-07 02:18:05,667 EPOCH 3 done: loss 0.2820 - lr: 0.000125
2023-10-07 02:18:12,203 DEV : loss 0.2055114060640335 - f1-score (micro avg) 0.7879
2023-10-07 02:18:12,208 saving best model
2023-10-07 02:18:13,055 ----------------------------------------------------------------------------------------------------
2023-10-07 02:18:22,915 epoch 4 - iter 27/275 - loss 0.18218725 - time (sec): 9.86 - samples/sec: 204.60 - lr: 0.000123 - momentum: 0.000000
2023-10-07 02:18:33,400 epoch 4 - iter 54/275 - loss 0.17494835 - time (sec): 20.34 - samples/sec: 204.09 - lr: 0.000121 - momentum: 0.000000
2023-10-07 02:18:43,892 epoch 4 - iter 81/275 - loss 0.17221347 - time (sec): 30.83 - samples/sec: 205.84 - lr: 0.000119 - momentum: 0.000000
2023-10-07 02:18:54,265 epoch 4 - iter 108/275 - loss 0.16848851 - time (sec): 41.21 - samples/sec: 205.18 - lr: 0.000118 - momentum: 0.000000
2023-10-07 02:19:04,912 epoch 4 - iter 135/275 - loss 0.16270097 - time (sec): 51.85 - samples/sec: 206.52 - lr: 0.000116 - momentum: 0.000000
2023-10-07 02:19:16,228 epoch 4 - iter 162/275 - loss 0.15537070 - time (sec): 63.17 - samples/sec: 208.97 - lr: 0.000114 - momentum: 0.000000
2023-10-07 02:19:26,738 epoch 4 - iter 189/275 - loss 0.15649557 - time (sec): 73.68 - samples/sec: 209.17 - lr: 0.000112 - momentum: 0.000000
2023-10-07 02:19:37,435 epoch 4 - iter 216/275 - loss 0.14952139 - time (sec): 84.38 - samples/sec: 208.44 - lr: 0.000111 - momentum: 0.000000
2023-10-07 02:19:48,127 epoch 4 - iter 243/275 - loss 0.14597619 - time (sec): 95.07 - samples/sec: 208.83 - lr: 0.000109 - momentum: 0.000000
2023-10-07 02:19:59,178 epoch 4 - iter 270/275 - loss 0.14571486 - time (sec): 106.12 - samples/sec: 210.01 - lr: 0.000107 - momentum: 0.000000
2023-10-07 02:20:01,262 ----------------------------------------------------------------------------------------------------
2023-10-07 02:20:01,262 EPOCH 4 done: loss 0.1439 - lr: 0.000107
2023-10-07 02:20:07,795 DEV : loss 0.13657842576503754 - f1-score (micro avg) 0.852
2023-10-07 02:20:07,800 saving best model
2023-10-07 02:20:08,650 ----------------------------------------------------------------------------------------------------
2023-10-07 02:20:19,324 epoch 5 - iter 27/275 - loss 0.10735410 - time (sec): 10.67 - samples/sec: 217.29 - lr: 0.000105 - momentum: 0.000000
2023-10-07 02:20:30,041 epoch 5 - iter 54/275 - loss 0.09200243 - time (sec): 21.39 - samples/sec: 215.10 - lr: 0.000103 - momentum: 0.000000
2023-10-07 02:20:39,930 epoch 5 - iter 81/275 - loss 0.09209995 - time (sec): 31.28 - samples/sec: 211.07 - lr: 0.000102 - momentum: 0.000000
2023-10-07 02:20:51,698 epoch 5 - iter 108/275 - loss 0.09118834 - time (sec): 43.05 - samples/sec: 212.84 - lr: 0.000100 - momentum: 0.000000
2023-10-07 02:21:02,612 epoch 5 - iter 135/275 - loss 0.09336177 - time (sec): 53.96 - samples/sec: 212.73 - lr: 0.000098 - momentum: 0.000000
2023-10-07 02:21:12,944 epoch 5 - iter 162/275 - loss 0.09227941 - time (sec): 64.29 - samples/sec: 211.13 - lr: 0.000096 - momentum: 0.000000
2023-10-07 02:21:23,975 epoch 5 - iter 189/275 - loss 0.08857508 - time (sec): 75.32 - samples/sec: 211.03 - lr: 0.000095 - momentum: 0.000000
2023-10-07 02:21:34,493 epoch 5 - iter 216/275 - loss 0.08863026 - time (sec): 85.84 - samples/sec: 211.12 - lr: 0.000093 - momentum: 0.000000
2023-10-07 02:21:44,705 epoch 5 - iter 243/275 - loss 0.09010889 - time (sec): 96.05 - samples/sec: 209.69 - lr: 0.000091 - momentum: 0.000000
2023-10-07 02:21:55,180 epoch 5 - iter 270/275 - loss 0.08568815 - time (sec): 106.53 - samples/sec: 210.17 - lr: 0.000089 - momentum: 0.000000
2023-10-07 02:21:57,055 ----------------------------------------------------------------------------------------------------
2023-10-07 02:21:57,055 EPOCH 5 done: loss 0.0857 - lr: 0.000089
2023-10-07 02:22:03,604 DEV : loss 0.12589101493358612 - f1-score (micro avg) 0.8499
2023-10-07 02:22:03,610 ----------------------------------------------------------------------------------------------------
2023-10-07 02:22:14,143 epoch 6 - iter 27/275 - loss 0.05644948 - time (sec): 10.53 - samples/sec: 205.18 - lr: 0.000087 - momentum: 0.000000
2023-10-07 02:22:25,074 epoch 6 - iter 54/275 - loss 0.07001457 - time (sec): 21.46 - samples/sec: 211.75 - lr: 0.000086 - momentum: 0.000000
2023-10-07 02:22:35,448 epoch 6 - iter 81/275 - loss 0.06319042 - time (sec): 31.84 - samples/sec: 212.48 - lr: 0.000084 - momentum: 0.000000
2023-10-07 02:22:46,343 epoch 6 - iter 108/275 - loss 0.06152701 - time (sec): 42.73 - samples/sec: 211.57 - lr: 0.000082 - momentum: 0.000000
2023-10-07 02:22:56,285 epoch 6 - iter 135/275 - loss 0.06315256 - time (sec): 52.67 - samples/sec: 211.26 - lr: 0.000080 - momentum: 0.000000
2023-10-07 02:23:06,825 epoch 6 - iter 162/275 - loss 0.06048732 - time (sec): 63.21 - samples/sec: 210.47 - lr: 0.000079 - momentum: 0.000000
2023-10-07 02:23:18,249 epoch 6 - iter 189/275 - loss 0.05972327 - time (sec): 74.64 - samples/sec: 211.77 - lr: 0.000077 - momentum: 0.000000
2023-10-07 02:23:28,702 epoch 6 - iter 216/275 - loss 0.06084496 - time (sec): 85.09 - samples/sec: 211.37 - lr: 0.000075 - momentum: 0.000000
2023-10-07 02:23:39,463 epoch 6 - iter 243/275 - loss 0.06321011 - time (sec): 95.85 - samples/sec: 211.46 - lr: 0.000073 - momentum: 0.000000
2023-10-07 02:23:49,914 epoch 6 - iter 270/275 - loss 0.06222198 - time (sec): 106.30 - samples/sec: 210.65 - lr: 0.000072 - momentum: 0.000000
2023-10-07 02:23:51,737 ----------------------------------------------------------------------------------------------------
2023-10-07 02:23:51,737 EPOCH 6 done: loss 0.0630 - lr: 0.000072
2023-10-07 02:23:58,273 DEV : loss 0.1208251342177391 - f1-score (micro avg) 0.8726
2023-10-07 02:23:58,278 saving best model
2023-10-07 02:23:59,127 ----------------------------------------------------------------------------------------------------
2023-10-07 02:24:09,885 epoch 7 - iter 27/275 - loss 0.03876563 - time (sec): 10.76 - samples/sec: 208.90 - lr: 0.000070 - momentum: 0.000000
2023-10-07 02:24:20,495 epoch 7 - iter 54/275 - loss 0.05629189 - time (sec): 21.37 - samples/sec: 215.00 - lr: 0.000068 - momentum: 0.000000
2023-10-07 02:24:30,012 epoch 7 - iter 81/275 - loss 0.05064682 - time (sec): 30.88 - samples/sec: 210.92 - lr: 0.000066 - momentum: 0.000000
2023-10-07 02:24:40,436 epoch 7 - iter 108/275 - loss 0.05021731 - time (sec): 41.31 - samples/sec: 208.87 - lr: 0.000064 - momentum: 0.000000
2023-10-07 02:24:51,775 epoch 7 - iter 135/275 - loss 0.05655057 - time (sec): 52.65 - samples/sec: 212.34 - lr: 0.000063 - momentum: 0.000000
2023-10-07 02:25:03,073 epoch 7 - iter 162/275 - loss 0.05126893 - time (sec): 63.94 - samples/sec: 212.58 - lr: 0.000061 - momentum: 0.000000
2023-10-07 02:25:13,874 epoch 7 - iter 189/275 - loss 0.05162456 - time (sec): 74.75 - samples/sec: 213.07 - lr: 0.000059 - momentum: 0.000000
2023-10-07 02:25:23,903 epoch 7 - iter 216/275 - loss 0.04951504 - time (sec): 84.77 - samples/sec: 211.15 - lr: 0.000058 - momentum: 0.000000
2023-10-07 02:25:34,373 epoch 7 - iter 243/275 - loss 0.04717364 - time (sec): 95.25 - samples/sec: 210.64 - lr: 0.000056 - momentum: 0.000000
2023-10-07 02:25:45,437 epoch 7 - iter 270/275 - loss 0.04980291 - time (sec): 106.31 - samples/sec: 210.78 - lr: 0.000054 - momentum: 0.000000
2023-10-07 02:25:47,273 ----------------------------------------------------------------------------------------------------
2023-10-07 02:25:47,273 EPOCH 7 done: loss 0.0496 - lr: 0.000054
2023-10-07 02:25:53,830 DEV : loss 0.12672832608222961 - f1-score (micro avg) 0.8795
2023-10-07 02:25:53,835 saving best model
2023-10-07 02:25:54,676 ----------------------------------------------------------------------------------------------------
2023-10-07 02:26:05,375 epoch 8 - iter 27/275 - loss 0.03165259 - time (sec): 10.70 - samples/sec: 209.13 - lr: 0.000052 - momentum: 0.000000
2023-10-07 02:26:15,870 epoch 8 - iter 54/275 - loss 0.02197697 - time (sec): 21.19 - samples/sec: 210.51 - lr: 0.000050 - momentum: 0.000000
2023-10-07 02:26:25,966 epoch 8 - iter 81/275 - loss 0.02864823 - time (sec): 31.29 - samples/sec: 206.34 - lr: 0.000048 - momentum: 0.000000
2023-10-07 02:26:36,737 epoch 8 - iter 108/275 - loss 0.03438549 - time (sec): 42.06 - samples/sec: 205.59 - lr: 0.000047 - momentum: 0.000000
2023-10-07 02:26:47,052 epoch 8 - iter 135/275 - loss 0.04362913 - time (sec): 52.37 - samples/sec: 206.88 - lr: 0.000045 - momentum: 0.000000
2023-10-07 02:26:57,822 epoch 8 - iter 162/275 - loss 0.04486382 - time (sec): 63.14 - samples/sec: 207.68 - lr: 0.000043 - momentum: 0.000000
2023-10-07 02:27:08,720 epoch 8 - iter 189/275 - loss 0.04765935 - time (sec): 74.04 - samples/sec: 208.45 - lr: 0.000042 - momentum: 0.000000
2023-10-07 02:27:19,772 epoch 8 - iter 216/275 - loss 0.04320726 - time (sec): 85.09 - samples/sec: 210.02 - lr: 0.000040 - momentum: 0.000000
2023-10-07 02:27:31,102 epoch 8 - iter 243/275 - loss 0.04126718 - time (sec): 96.42 - samples/sec: 211.05 - lr: 0.000038 - momentum: 0.000000
2023-10-07 02:27:41,170 epoch 8 - iter 270/275 - loss 0.04102295 - time (sec): 106.49 - samples/sec: 209.79 - lr: 0.000036 - momentum: 0.000000
2023-10-07 02:27:43,234 ----------------------------------------------------------------------------------------------------
2023-10-07 02:27:43,234 EPOCH 8 done: loss 0.0408 - lr: 0.000036
2023-10-07 02:27:49,802 DEV : loss 0.1264316886663437 - f1-score (micro avg) 0.8779
2023-10-07 02:27:49,807 ----------------------------------------------------------------------------------------------------
2023-10-07 02:28:00,940 epoch 9 - iter 27/275 - loss 0.00899587 - time (sec): 11.13 - samples/sec: 211.01 - lr: 0.000034 - momentum: 0.000000
2023-10-07 02:28:11,284 epoch 9 - iter 54/275 - loss 0.02817830 - time (sec): 21.48 - samples/sec: 205.02 - lr: 0.000032 - momentum: 0.000000
2023-10-07 02:28:22,131 epoch 9 - iter 81/275 - loss 0.02789991 - time (sec): 32.32 - samples/sec: 206.60 - lr: 0.000031 - momentum: 0.000000
2023-10-07 02:28:32,791 epoch 9 - iter 108/275 - loss 0.03294531 - time (sec): 42.98 - samples/sec: 209.50 - lr: 0.000029 - momentum: 0.000000
2023-10-07 02:28:43,731 epoch 9 - iter 135/275 - loss 0.03520420 - time (sec): 53.92 - samples/sec: 210.73 - lr: 0.000027 - momentum: 0.000000
2023-10-07 02:28:54,910 epoch 9 - iter 162/275 - loss 0.03624598 - time (sec): 65.10 - samples/sec: 211.44 - lr: 0.000026 - momentum: 0.000000
2023-10-07 02:29:05,800 epoch 9 - iter 189/275 - loss 0.03474740 - time (sec): 75.99 - samples/sec: 211.64 - lr: 0.000024 - momentum: 0.000000
2023-10-07 02:29:15,982 epoch 9 - iter 216/275 - loss 0.03378550 - time (sec): 86.17 - samples/sec: 211.19 - lr: 0.000022 - momentum: 0.000000
2023-10-07 02:29:26,369 epoch 9 - iter 243/275 - loss 0.03603380 - time (sec): 96.56 - samples/sec: 210.50 - lr: 0.000020 - momentum: 0.000000
2023-10-07 02:29:36,654 epoch 9 - iter 270/275 - loss 0.03756618 - time (sec): 106.85 - samples/sec: 210.61 - lr: 0.000019 - momentum: 0.000000
2023-10-07 02:29:38,234 ----------------------------------------------------------------------------------------------------
2023-10-07 02:29:38,234 EPOCH 9 done: loss 0.0373 - lr: 0.000019
2023-10-07 02:29:44,751 DEV : loss 0.1213604137301445 - f1-score (micro avg) 0.8828
2023-10-07 02:29:44,756 saving best model
2023-10-07 02:29:45,614 ----------------------------------------------------------------------------------------------------
2023-10-07 02:29:55,956 epoch 10 - iter 27/275 - loss 0.00898985 - time (sec): 10.34 - samples/sec: 210.06 - lr: 0.000017 - momentum: 0.000000
2023-10-07 02:30:06,205 epoch 10 - iter 54/275 - loss 0.01494168 - time (sec): 20.59 - samples/sec: 215.07 - lr: 0.000015 - momentum: 0.000000
2023-10-07 02:30:17,186 epoch 10 - iter 81/275 - loss 0.01826487 - time (sec): 31.57 - samples/sec: 213.68 - lr: 0.000013 - momentum: 0.000000
2023-10-07 02:30:27,975 epoch 10 - iter 108/275 - loss 0.01960483 - time (sec): 42.36 - samples/sec: 214.40 - lr: 0.000011 - momentum: 0.000000
2023-10-07 02:30:38,405 epoch 10 - iter 135/275 - loss 0.02051339 - time (sec): 52.79 - samples/sec: 211.43 - lr: 0.000010 - momentum: 0.000000
2023-10-07 02:30:49,274 epoch 10 - iter 162/275 - loss 0.02458921 - time (sec): 63.66 - samples/sec: 212.10 - lr: 0.000008 - momentum: 0.000000
2023-10-07 02:31:00,291 epoch 10 - iter 189/275 - loss 0.02944180 - time (sec): 74.67 - samples/sec: 212.15 - lr: 0.000006 - momentum: 0.000000
2023-10-07 02:31:11,010 epoch 10 - iter 216/275 - loss 0.03490929 - time (sec): 85.39 - samples/sec: 212.65 - lr: 0.000004 - momentum: 0.000000
2023-10-07 02:31:21,373 epoch 10 - iter 243/275 - loss 0.03412757 - time (sec): 95.76 - samples/sec: 211.91 - lr: 0.000003 - momentum: 0.000000
2023-10-07 02:31:31,934 epoch 10 - iter 270/275 - loss 0.03438963 - time (sec): 106.32 - samples/sec: 210.94 - lr: 0.000001 - momentum: 0.000000
2023-10-07 02:31:33,696 ----------------------------------------------------------------------------------------------------
2023-10-07 02:31:33,696 EPOCH 10 done: loss 0.0339 - lr: 0.000001
2023-10-07 02:31:40,228 DEV : loss 0.12219757586717606 - f1-score (micro avg) 0.8777
2023-10-07 02:31:41,035 ----------------------------------------------------------------------------------------------------
2023-10-07 02:31:41,036 Loading model from best epoch ...
2023-10-07 02:31:43,709 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
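The 25-tag dictionary above is the BIOES expansion of six entity types plus the outside tag, which also explains the `out_features=25` of the model's final linear layer. A small check (the tag enumeration order here is illustrative, not necessarily the dictionary's internal order):

```python
# 25 tags = "O" + {S, B, E, I} x 6 entity types (BIOES span encoding).
entity_types = ["scope", "pers", "work", "loc", "object", "date"]
prefixes = ["S", "B", "E", "I"]   # single-token, begin, end, inside
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]
print(len(tags))   # 25, matching out_features of the final linear layer
```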
2023-10-07 02:31:50,610 
Results:
- F-score (micro) 0.8877
- F-score (macro) 0.732
- Accuracy 0.8134

By class:
              precision    recall  f1-score   support

       scope     0.8857    0.8807    0.8832       176
        pers     0.9440    0.9219    0.9328       128
        work     0.8125    0.8784    0.8442        74
      object     0.0000    0.0000    0.0000         2
         loc     1.0000    1.0000    1.0000         2

   micro avg     0.8854    0.8901    0.8877       382
   macro avg     0.7284    0.7362    0.7320       382
weighted avg     0.8870    0.8901    0.8882       382
2023-10-07 02:31:50,610 ----------------------------------------------------------------------------------------------------
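The summary scores can be re-derived from the per-class rows of the test report above, which makes the macro/micro gap easy to see: the macro average is an unweighted mean, so the 0.0 F1 on the 2-support `object` class drags it down, while the micro average pools all 382 spans:

```python
# Per-class (precision, recall, f1, support) rows copied from the test report above.
per_class = {
    "scope":  (0.8857, 0.8807, 0.8832, 176),
    "pers":   (0.9440, 0.9219, 0.9328, 128),
    "work":   (0.8125, 0.8784, 0.8442,  74),
    "object": (0.0000, 0.0000, 0.0000,   2),
    "loc":    (1.0000, 1.0000, 1.0000,   2),
}

# Macro F1: unweighted mean over classes.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
print(round(macro_f1, 4))   # 0.732

# Micro F1: harmonic mean of the pooled (micro) precision and recall.
p, r = 0.8854, 0.8901
print(round(2 * p * r / (p + r), 4))   # 0.8877
```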