2023-10-25 15:40:49,725 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:40:49,726 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 15:40:49,726 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 Train: 20847 sentences
2023-10-25 15:40:49,726 (train_with_dev=False, train_with_test=False)
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Training Params:
2023-10-25 15:40:49,727 - learning_rate: "5e-05"
2023-10-25 15:40:49,727 - mini_batch_size: "4"
2023-10-25 15:40:49,727 - max_epochs: "10"
2023-10-25 15:40:49,727 - shuffle: "True"
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Plugins:
2023-10-25 15:40:49,727 - TensorboardLogger
2023-10-25 15:40:49,727 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:40:49,727 - metric: "('micro avg', 'f1-score')"
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Computation:
2023-10-25 15:40:49,727 - compute on device: cuda:0
2023-10-25 15:40:49,727 - embedding storage: none
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
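
The parameters logged above (learning rate 5e-05, mini-batch size 4, 10 epochs, linear warmup over 10% of the steps, micro-F1 model selection on the dev split) are what Flair's fine-tuning entry point takes. A sketch continuing the snippet above, to be read as a reconstruction from this log rather than the original training script:

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# In recent Flair releases, fine_tune() defaults to AdamW with a linear warmup
# schedule, which matches the "LinearScheduler | warmup_fraction: '0.1'" plugin
# logged above. The TensorboardLogger plugin is omitted from this sketch.
trainer.fine_tune(
    "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
)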
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:41:12,430 epoch 1 - iter 521/5212 - loss 1.24847613 - time (sec): 22.70 - samples/sec: 1631.49 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:41:35,742 epoch 1 - iter 1042/5212 - loss 0.79409542 - time (sec): 46.01 - samples/sec: 1589.33 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:41:58,495 epoch 1 - iter 1563/5212 - loss 0.62000166 - time (sec): 68.77 - samples/sec: 1620.44 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:42:21,224 epoch 1 - iter 2084/5212 - loss 0.52313149 - time (sec): 91.50 - samples/sec: 1620.63 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:42:45,174 epoch 1 - iter 2605/5212 - loss 0.46787427 - time (sec): 115.45 - samples/sec: 1636.39 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:43:07,482 epoch 1 - iter 3126/5212 - loss 0.42636894 - time (sec): 137.75 - samples/sec: 1634.90 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:43:29,865 epoch 1 - iter 3647/5212 - loss 0.39828423 - time (sec): 160.14 - samples/sec: 1640.03 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:43:52,495 epoch 1 - iter 4168/5212 - loss 0.37970166 - time (sec): 182.77 - samples/sec: 1635.63 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:44:15,495 epoch 1 - iter 4689/5212 - loss 0.37361807 - time (sec): 205.77 - samples/sec: 1614.91 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:44:37,828 epoch 1 - iter 5210/5212 - loss 0.36166651 - time (sec): 228.10 - samples/sec: 1610.66 - lr: 0.000050 - momentum: 0.000000
2023-10-25 15:44:37,911 ----------------------------------------------------------------------------------------------------
2023-10-25 15:44:37,911 EPOCH 1 done: loss 0.3617 - lr: 0.000050
2023-10-25 15:44:41,574 DEV : loss 0.22080442309379578 - f1-score (micro avg) 0.1438
2023-10-25 15:44:41,599 saving best model
2023-10-25 15:44:42,080 ----------------------------------------------------------------------------------------------------
2023-10-25 15:45:05,344 epoch 2 - iter 521/5212 - loss 0.27701476 - time (sec): 23.26 - samples/sec: 1639.01 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:45:28,141 epoch 2 - iter 1042/5212 - loss 0.26774174 - time (sec): 46.06 - samples/sec: 1663.32 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:45:50,898 epoch 2 - iter 1563/5212 - loss 0.25315314 - time (sec): 68.82 - samples/sec: 1650.18 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:46:14,163 epoch 2 - iter 2084/5212 - loss 0.29757522 - time (sec): 92.08 - samples/sec: 1628.91 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:46:36,427 epoch 2 - iter 2605/5212 - loss 0.32230692 - time (sec): 114.35 - samples/sec: 1631.59 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:46:58,692 epoch 2 - iter 3126/5212 - loss 0.31370599 - time (sec): 136.61 - samples/sec: 1625.37 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:47:21,123 epoch 2 - iter 3647/5212 - loss 0.30764574 - time (sec): 159.04 - samples/sec: 1631.06 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:47:43,308 epoch 2 - iter 4168/5212 - loss 0.29897371 - time (sec): 181.23 - samples/sec: 1637.77 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:48:05,566 epoch 2 - iter 4689/5212 - loss 0.29460317 - time (sec): 203.48 - samples/sec: 1623.40 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:48:27,906 epoch 2 - iter 5210/5212 - loss 0.28663517 - time (sec): 225.83 - samples/sec: 1626.67 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:48:27,993 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:27,993 EPOCH 2 done: loss 0.2867 - lr: 0.000044
2023-10-25 15:48:35,139 DEV : loss 0.15546594560146332 - f1-score (micro avg) 0.2256
2023-10-25 15:48:35,165 saving best model
2023-10-25 15:48:35,768 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:57,895 epoch 3 - iter 521/5212 - loss 0.30169616 - time (sec): 22.12 - samples/sec: 1661.09 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:49:20,057 epoch 3 - iter 1042/5212 - loss 0.30759449 - time (sec): 44.29 - samples/sec: 1617.71 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:49:42,707 epoch 3 - iter 1563/5212 - loss 0.27595502 - time (sec): 66.94 - samples/sec: 1612.63 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:05,056 epoch 3 - iter 2084/5212 - loss 0.24782471 - time (sec): 89.29 - samples/sec: 1632.91 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:27,628 epoch 3 - iter 2605/5212 - loss 0.23028822 - time (sec): 111.86 - samples/sec: 1639.48 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:48,988 epoch 3 - iter 3126/5212 - loss 0.22726832 - time (sec): 133.22 - samples/sec: 1652.52 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:12,671 epoch 3 - iter 3647/5212 - loss 0.22179543 - time (sec): 156.90 - samples/sec: 1629.53 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:34,997 epoch 3 - iter 4168/5212 - loss 0.21682299 - time (sec): 179.23 - samples/sec: 1633.25 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:51:57,314 epoch 3 - iter 4689/5212 - loss 0.21234182 - time (sec): 201.54 - samples/sec: 1625.65 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:52:19,667 epoch 3 - iter 5210/5212 - loss 0.20687427 - time (sec): 223.90 - samples/sec: 1640.40 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:52:19,759 ----------------------------------------------------------------------------------------------------
2023-10-25 15:52:19,759 EPOCH 3 done: loss 0.2068 - lr: 0.000039
2023-10-25 15:52:26,616 DEV : loss 0.24288956820964813 - f1-score (micro avg) 0.2767
2023-10-25 15:52:26,640 saving best model
2023-10-25 15:52:27,244 ----------------------------------------------------------------------------------------------------
2023-10-25 15:52:49,976 epoch 4 - iter 521/5212 - loss 0.14541617 - time (sec): 22.73 - samples/sec: 1594.73 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:53:12,334 epoch 4 - iter 1042/5212 - loss 0.14059672 - time (sec): 45.09 - samples/sec: 1612.24 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:53:34,746 epoch 4 - iter 1563/5212 - loss 0.14336781 - time (sec): 67.50 - samples/sec: 1619.62 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:53:56,360 epoch 4 - iter 2084/5212 - loss 0.15113922 - time (sec): 89.11 - samples/sec: 1613.43 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:54:18,268 epoch 4 - iter 2605/5212 - loss 0.15623749 - time (sec): 111.02 - samples/sec: 1607.30 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:54:40,720 epoch 4 - iter 3126/5212 - loss 0.15743031 - time (sec): 133.47 - samples/sec: 1639.68 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:55:02,499 epoch 4 - iter 3647/5212 - loss 0.15737676 - time (sec): 155.25 - samples/sec: 1630.29 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:55:24,814 epoch 4 - iter 4168/5212 - loss 0.16195469 - time (sec): 177.57 - samples/sec: 1635.41 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:55:46,893 epoch 4 - iter 4689/5212 - loss 0.16745988 - time (sec): 199.65 - samples/sec: 1638.43 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:56:09,860 epoch 4 - iter 5210/5212 - loss 0.16610419 - time (sec): 222.61 - samples/sec: 1649.17 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:56:09,961 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:09,961 EPOCH 4 done: loss 0.1662 - lr: 0.000033
2023-10-25 15:56:16,861 DEV : loss 0.23563335835933685 - f1-score (micro avg) 0.3339
2023-10-25 15:56:16,886 saving best model
2023-10-25 15:56:17,503 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:40,112 epoch 5 - iter 521/5212 - loss 0.16272851 - time (sec): 22.60 - samples/sec: 1648.41 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:57:02,345 epoch 5 - iter 1042/5212 - loss 0.15512717 - time (sec): 44.84 - samples/sec: 1621.45 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:57:24,778 epoch 5 - iter 1563/5212 - loss 0.15269704 - time (sec): 67.27 - samples/sec: 1649.97 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:57:46,753 epoch 5 - iter 2084/5212 - loss 0.16050403 - time (sec): 89.25 - samples/sec: 1661.90 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:58:08,939 epoch 5 - iter 2605/5212 - loss 0.16890836 - time (sec): 111.43 - samples/sec: 1645.12 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:58:30,542 epoch 5 - iter 3126/5212 - loss 0.16716994 - time (sec): 133.03 - samples/sec: 1650.66 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:58:52,733 epoch 5 - iter 3647/5212 - loss 0.16808642 - time (sec): 155.23 - samples/sec: 1651.60 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:59:14,632 epoch 5 - iter 4168/5212 - loss 0.16952321 - time (sec): 177.12 - samples/sec: 1661.77 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:59:36,952 epoch 5 - iter 4689/5212 - loss 0.17072890 - time (sec): 199.44 - samples/sec: 1670.87 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:59:58,912 epoch 5 - iter 5210/5212 - loss 0.17071298 - time (sec): 221.40 - samples/sec: 1659.36 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:59:58,988 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:58,989 EPOCH 5 done: loss 0.1707 - lr: 0.000028
2023-10-25 16:00:05,971 DEV : loss 0.2225915640592575 - f1-score (micro avg) 0.2691
2023-10-25 16:00:05,997 ----------------------------------------------------------------------------------------------------
2023-10-25 16:00:28,448 epoch 6 - iter 521/5212 - loss 0.14161427 - time (sec): 22.45 - samples/sec: 1756.92 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:00:50,604 epoch 6 - iter 1042/5212 - loss 0.13835517 - time (sec): 44.61 - samples/sec: 1765.39 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:01:12,353 epoch 6 - iter 1563/5212 - loss 0.14662044 - time (sec): 66.35 - samples/sec: 1706.62 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:01:33,931 epoch 6 - iter 2084/5212 - loss 0.16265685 - time (sec): 87.93 - samples/sec: 1700.00 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:01:55,994 epoch 6 - iter 2605/5212 - loss 0.16371044 - time (sec): 110.00 - samples/sec: 1661.81 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:02:17,993 epoch 6 - iter 3126/5212 - loss 0.16618079 - time (sec): 131.99 - samples/sec: 1661.11 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:02:40,063 epoch 6 - iter 3647/5212 - loss 0.16564198 - time (sec): 154.06 - samples/sec: 1666.52 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:03:02,499 epoch 6 - iter 4168/5212 - loss 0.16634156 - time (sec): 176.50 - samples/sec: 1667.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:03:24,979 epoch 6 - iter 4689/5212 - loss 0.16502704 - time (sec): 198.98 - samples/sec: 1667.40 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:03:46,903 epoch 6 - iter 5210/5212 - loss 0.16841155 - time (sec): 220.90 - samples/sec: 1662.80 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:03:46,983 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:46,983 EPOCH 6 done: loss 0.1684 - lr: 0.000022
2023-10-25 16:03:53,850 DEV : loss 0.20808175206184387 - f1-score (micro avg) 0.2423
2023-10-25 16:03:53,875 ----------------------------------------------------------------------------------------------------
2023-10-25 16:04:16,243 epoch 7 - iter 521/5212 - loss 0.16586561 - time (sec): 22.37 - samples/sec: 1558.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:04:38,816 epoch 7 - iter 1042/5212 - loss 0.14976359 - time (sec): 44.94 - samples/sec: 1600.71 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:05:00,525 epoch 7 - iter 1563/5212 - loss 0.14294447 - time (sec): 66.65 - samples/sec: 1584.38 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:05:22,503 epoch 7 - iter 2084/5212 - loss 0.14958504 - time (sec): 88.63 - samples/sec: 1605.49 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:05:44,352 epoch 7 - iter 2605/5212 - loss 0.15272797 - time (sec): 110.48 - samples/sec: 1622.73 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:06:07,447 epoch 7 - iter 3126/5212 - loss 0.15256052 - time (sec): 133.57 - samples/sec: 1647.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:06:29,677 epoch 7 - iter 3647/5212 - loss 0.15252498 - time (sec): 155.80 - samples/sec: 1675.50 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:06:51,711 epoch 7 - iter 4168/5212 - loss 0.15070024 - time (sec): 177.83 - samples/sec: 1676.15 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:07:13,611 epoch 7 - iter 4689/5212 - loss 0.15136587 - time (sec): 199.73 - samples/sec: 1674.35 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:07:35,780 epoch 7 - iter 5210/5212 - loss 0.15018887 - time (sec): 221.90 - samples/sec: 1654.74 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:07:35,873 ----------------------------------------------------------------------------------------------------
2023-10-25 16:07:35,873 EPOCH 7 done: loss 0.1502 - lr: 0.000017
2023-10-25 16:07:42,032 DEV : loss 0.2270050346851349 - f1-score (micro avg) 0.2385
2023-10-25 16:07:42,058 ----------------------------------------------------------------------------------------------------
2023-10-25 16:08:04,521 epoch 8 - iter 521/5212 - loss 0.14589267 - time (sec): 22.46 - samples/sec: 1749.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:08:26,536 epoch 8 - iter 1042/5212 - loss 0.12679147 - time (sec): 44.48 - samples/sec: 1648.21 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:08:49,455 epoch 8 - iter 1563/5212 - loss 0.13027312 - time (sec): 67.40 - samples/sec: 1629.21 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:11,507 epoch 8 - iter 2084/5212 - loss 0.13178230 - time (sec): 89.45 - samples/sec: 1632.23 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:33,603 epoch 8 - iter 2605/5212 - loss 0.13370436 - time (sec): 111.54 - samples/sec: 1649.83 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:55,760 epoch 8 - iter 3126/5212 - loss 0.13098280 - time (sec): 133.70 - samples/sec: 1673.75 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:17,942 epoch 8 - iter 3647/5212 - loss 0.13384808 - time (sec): 155.88 - samples/sec: 1659.85 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:40,208 epoch 8 - iter 4168/5212 - loss 0.13575297 - time (sec): 178.15 - samples/sec: 1686.01 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:11:01,988 epoch 8 - iter 4689/5212 - loss 0.13594078 - time (sec): 199.93 - samples/sec: 1671.82 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:11:23,871 epoch 8 - iter 5210/5212 - loss 0.13594037 - time (sec): 221.81 - samples/sec: 1656.28 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:11:23,947 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:23,947 EPOCH 8 done: loss 0.1359 - lr: 0.000011
2023-10-25 16:11:30,260 DEV : loss 0.2554771304130554 - f1-score (micro avg) 0.2288
2023-10-25 16:11:30,286 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:52,419 epoch 9 - iter 521/5212 - loss 0.11567060 - time (sec): 22.13 - samples/sec: 1548.27 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:12:14,779 epoch 9 - iter 1042/5212 - loss 0.12925166 - time (sec): 44.49 - samples/sec: 1633.21 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:12:36,602 epoch 9 - iter 1563/5212 - loss 0.12394185 - time (sec): 66.31 - samples/sec: 1646.06 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:12:58,308 epoch 9 - iter 2084/5212 - loss 0.12623309 - time (sec): 88.02 - samples/sec: 1642.85 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:13:21,115 epoch 9 - iter 2605/5212 - loss 0.12686418 - time (sec): 110.83 - samples/sec: 1621.52 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:13:43,274 epoch 9 - iter 3126/5212 - loss 0.12617219 - time (sec): 132.99 - samples/sec: 1632.06 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:14:06,024 epoch 9 - iter 3647/5212 - loss 0.12470505 - time (sec): 155.74 - samples/sec: 1630.26 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:14:28,476 epoch 9 - iter 4168/5212 - loss 0.12338268 - time (sec): 178.19 - samples/sec: 1647.55 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:14:50,431 epoch 9 - iter 4689/5212 - loss 0.12303248 - time (sec): 200.14 - samples/sec: 1655.20 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:15:13,282 epoch 9 - iter 5210/5212 - loss 0.12380050 - time (sec): 222.99 - samples/sec: 1647.36 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:15:13,380 ----------------------------------------------------------------------------------------------------
2023-10-25 16:15:13,380 EPOCH 9 done: loss 0.1238 - lr: 0.000006
2023-10-25 16:15:20,024 DEV : loss 0.25904580950737 - f1-score (micro avg) 0.2442
2023-10-25 16:15:20,051 ----------------------------------------------------------------------------------------------------
2023-10-25 16:15:42,601 epoch 10 - iter 521/5212 - loss 0.09950056 - time (sec): 22.55 - samples/sec: 1569.28 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:16:05,408 epoch 10 - iter 1042/5212 - loss 0.09962004 - time (sec): 45.36 - samples/sec: 1621.04 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:16:28,632 epoch 10 - iter 1563/5212 - loss 0.10130806 - time (sec): 68.58 - samples/sec: 1566.23 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:16:50,806 epoch 10 - iter 2084/5212 - loss 0.10676368 - time (sec): 90.75 - samples/sec: 1600.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:17:12,875 epoch 10 - iter 2605/5212 - loss 0.11022598 - time (sec): 112.82 - samples/sec: 1629.67 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:17:35,096 epoch 10 - iter 3126/5212 - loss 0.10822074 - time (sec): 135.04 - samples/sec: 1640.39 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:17:57,698 epoch 10 - iter 3647/5212 - loss 0.11044716 - time (sec): 157.65 - samples/sec: 1638.54 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:18:20,171 epoch 10 - iter 4168/5212 - loss 0.11138827 - time (sec): 180.12 - samples/sec: 1634.56 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:18:42,613 epoch 10 - iter 4689/5212 - loss 0.11209184 - time (sec): 202.56 - samples/sec: 1624.26 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:19:04,757 epoch 10 - iter 5210/5212 - loss 0.11119727 - time (sec): 224.70 - samples/sec: 1634.63 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:19:04,836 ----------------------------------------------------------------------------------------------------
2023-10-25 16:19:04,836 EPOCH 10 done: loss 0.1112 - lr: 0.000000
2023-10-25 16:19:11,706 DEV : loss 0.2723826766014099 - f1-score (micro avg) 0.232
2023-10-25 16:19:12,230 ----------------------------------------------------------------------------------------------------
2023-10-25 16:19:12,231 Loading model from best epoch ...
2023-10-25 16:19:13,972 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
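
The checkpoint loaded here is the best-on-dev model, saved after epoch 4 (dev micro-F1 0.3339); the block below is its evaluation on the held-out test split. Loading that checkpoint and tagging new text follows the usual Flair pattern. A minimal usage sketch, with a made-up German example sentence and the best-model path assembled from the training base path above:

from flair.data import Sentence
from flair.models import SequenceTagger

# Path assembled from the training base path logged above (assumption).
tagger = SequenceTagger.load(
    "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

# Made-up example sentence; any German text works the same way.
sentence = Sentence("Der Gemeinderat von Zürich tagte gestern im Rathaus .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.tag)  # spans labelled LOC / PER / ORG / HumanProd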
2023-10-25 16:19:23,755 
Results:
- F-score (micro) 0.3408
- F-score (macro) 0.2134
- Accuracy 0.2086

By class:
              precision    recall  f1-score   support

         LOC     0.4309    0.4876    0.4575      1214
         PER     0.2575    0.2970    0.2759       808
         ORG     0.1042    0.1416    0.1200       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3166    0.3690    0.3408      2390
   macro avg     0.1981    0.2316    0.2134      2390
weighted avg     0.3213    0.3690    0.3434      2390

2023-10-25 16:19:23,755 ----------------------------------------------------------------------------------------------------