2023-10-13 08:30:02,855 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 Train: 1100 sentences
2023-10-13 08:30:02,856 (train_with_dev=False, train_with_test=False)
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 Training Params:
2023-10-13 08:30:02,856 - learning_rate: "3e-05"
2023-10-13 08:30:02,856 - mini_batch_size: "4"
2023-10-13 08:30:02,856 - max_epochs: "10"
2023-10-13 08:30:02,856 - shuffle: "True"
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 Plugins:
2023-10-13 08:30:02,856 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:30:02,856 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 Computation:
2023-10-13 08:30:02,856 - compute on device: cuda:0
2023-10-13 08:30:02,856 - embedding storage: none
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:02,856 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:04,144 epoch 1 - iter 27/275 - loss 3.12256091 - time (sec): 1.29 - samples/sec: 1881.91 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:30:05,494 epoch 1 - iter 54/275 - loss 2.73588781 - time (sec): 2.64 - samples/sec: 1721.27 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:30:06,747 epoch 1 - iter 81/275 - loss 2.11741342 - time (sec): 3.89 - samples/sec: 1759.29 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:30:07,941 epoch 1 - iter 108/275 - loss 1.76793211 - time (sec): 5.08 - samples/sec: 1777.11 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:30:09,143 epoch 1 - iter 135/275 - loss 1.56633956 - time (sec): 6.29 - samples/sec: 1795.64 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:30:10,376 epoch 1 - iter 162/275 - loss 1.42899417 - time (sec): 7.52 - samples/sec: 1764.94 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:30:11,623 epoch 1 - iter 189/275 - loss 1.28375594 - time (sec): 8.77 - samples/sec: 1790.74 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:30:12,860 epoch 1 - iter 216/275 - loss 1.16818959 - time (sec): 10.00 - samples/sec: 1784.82 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:30:14,096 epoch 1 - iter 243/275 - loss 1.08409111 - time (sec): 11.24 - samples/sec: 1787.90 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:30:15,283 epoch 1 - iter 270/275 - loss 1.00861787 - time (sec): 12.43 - samples/sec: 1792.64 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:30:15,505 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:15,505 EPOCH 1 done: loss 0.9973 - lr: 0.000029
2023-10-13 08:30:16,318 DEV : loss 0.23283608257770538 - f1-score (micro avg) 0.6629
2023-10-13 08:30:16,323 saving best model
2023-10-13 08:30:16,660 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:17,809 epoch 2 - iter 27/275 - loss 0.20631712 - time (sec): 1.15 - samples/sec: 1806.74 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:30:18,998 epoch 2 - iter 54/275 - loss 0.22953289 - time (sec): 2.34 - samples/sec: 1904.20 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:30:20,152 epoch 2 - iter 81/275 - loss 0.22170780 - time (sec): 3.49 - samples/sec: 1937.42 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:30:21,374 epoch 2 - iter 108/275 - loss 0.21397525 - time (sec): 4.71 - samples/sec: 1972.24 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:30:22,571 epoch 2 - iter 135/275 - loss 0.21197127 - time (sec): 5.91 - samples/sec: 1977.94 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:30:23,747 epoch 2 - iter 162/275 - loss 0.20464864 - time (sec): 7.09 - samples/sec: 1956.91 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:30:24,930 epoch 2 - iter 189/275 - loss 0.20054963 - time (sec): 8.27 - samples/sec: 1933.60 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:30:26,120 epoch 2 - iter 216/275 - loss 0.19706410 - time (sec): 9.46 - samples/sec: 1930.90 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:30:27,256 epoch 2 - iter 243/275 - loss 0.18940321 - time (sec): 10.59 - samples/sec: 1928.02 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:30:28,416 epoch 2 - iter 270/275 - loss 0.18411801 - time (sec): 11.75 - samples/sec: 1902.18 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:30:28,646 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:28,647 EPOCH 2 done: loss 0.1857 - lr: 0.000027
2023-10-13 08:30:29,336 DEV : loss 0.13876736164093018 - f1-score (micro avg) 0.8377
2023-10-13 08:30:29,342 saving best model
2023-10-13 08:30:29,797 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:31,028 epoch 3 - iter 27/275 - loss 0.08676695 - time (sec): 1.23 - samples/sec: 1865.67 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:30:32,243 epoch 3 - iter 54/275 - loss 0.11105096 - time (sec): 2.44 - samples/sec: 1900.57 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:30:33,443 epoch 3 - iter 81/275 - loss 0.11026128 - time (sec): 3.64 - samples/sec: 1894.97 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:30:34,635 epoch 3 - iter 108/275 - loss 0.09728597 - time (sec): 4.84 - samples/sec: 1877.65 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:30:35,834 epoch 3 - iter 135/275 - loss 0.11057918 - time (sec): 6.03 - samples/sec: 1905.70 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:30:37,014 epoch 3 - iter 162/275 - loss 0.11211846 - time (sec): 7.21 - samples/sec: 1878.86 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:30:38,268 epoch 3 - iter 189/275 - loss 0.10686033 - time (sec): 8.47 - samples/sec: 1866.84 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:30:39,465 epoch 3 - iter 216/275 - loss 0.10678281 - time (sec): 9.66 - samples/sec: 1858.69 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:30:40,664 epoch 3 - iter 243/275 - loss 0.10493191 - time (sec): 10.86 - samples/sec: 1867.30 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:30:41,872 epoch 3 - iter 270/275 - loss 0.10312391 - time (sec): 12.07 - samples/sec: 1855.13 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:30:42,095 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:42,096 EPOCH 3 done: loss 0.1019 - lr: 0.000023
2023-10-13 08:30:42,829 DEV : loss 0.1486443281173706 - f1-score (micro avg) 0.8523
2023-10-13 08:30:42,836 saving best model
2023-10-13 08:30:43,371 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:44,793 epoch 4 - iter 27/275 - loss 0.07441677 - time (sec): 1.42 - samples/sec: 1577.23 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:30:46,179 epoch 4 - iter 54/275 - loss 0.08226076 - time (sec): 2.80 - samples/sec: 1570.96 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:30:47,582 epoch 4 - iter 81/275 - loss 0.06455288 - time (sec): 4.20 - samples/sec: 1614.32 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:30:48,954 epoch 4 - iter 108/275 - loss 0.07124220 - time (sec): 5.58 - samples/sec: 1586.17 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:30:50,353 epoch 4 - iter 135/275 - loss 0.07867153 - time (sec): 6.98 - samples/sec: 1598.33 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:30:51,688 epoch 4 - iter 162/275 - loss 0.07905647 - time (sec): 8.31 - samples/sec: 1617.34 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:30:53,061 epoch 4 - iter 189/275 - loss 0.07773678 - time (sec): 9.68 - samples/sec: 1621.71 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:30:54,274 epoch 4 - iter 216/275 - loss 0.07725977 - time (sec): 10.90 - samples/sec: 1632.74 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:30:55,444 epoch 4 - iter 243/275 - loss 0.07933465 - time (sec): 12.07 - samples/sec: 1642.09 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:30:56,612 epoch 4 - iter 270/275 - loss 0.07839244 - time (sec): 13.23 - samples/sec: 1689.29 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:30:56,828 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:56,829 EPOCH 4 done: loss 0.0774 - lr: 0.000020
2023-10-13 08:30:57,572 DEV : loss 0.15350772440433502 - f1-score (micro avg) 0.8758
2023-10-13 08:30:57,578 saving best model
2023-10-13 08:30:58,200 ----------------------------------------------------------------------------------------------------
2023-10-13 08:30:59,595 epoch 5 - iter 27/275 - loss 0.07548554 - time (sec): 1.39 - samples/sec: 1754.44 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:31:01,038 epoch 5 - iter 54/275 - loss 0.06764304 - time (sec): 2.84 - samples/sec: 1684.43 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:31:02,436 epoch 5 - iter 81/275 - loss 0.06984076 - time (sec): 4.23 - samples/sec: 1630.67 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:31:03,882 epoch 5 - iter 108/275 - loss 0.06004683 - time (sec): 5.68 - samples/sec: 1601.75 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:31:05,333 epoch 5 - iter 135/275 - loss 0.06436743 - time (sec): 7.13 - samples/sec: 1583.01 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:31:06,783 epoch 5 - iter 162/275 - loss 0.05845132 - time (sec): 8.58 - samples/sec: 1550.48 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:31:08,243 epoch 5 - iter 189/275 - loss 0.05821264 - time (sec): 10.04 - samples/sec: 1552.31 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:31:09,661 epoch 5 - iter 216/275 - loss 0.06126030 - time (sec): 11.46 - samples/sec: 1545.09 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:31:11,113 epoch 5 - iter 243/275 - loss 0.05900075 - time (sec): 12.91 - samples/sec: 1561.69 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:31:12,497 epoch 5 - iter 270/275 - loss 0.05727806 - time (sec): 14.30 - samples/sec: 1563.51 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:31:12,765 ----------------------------------------------------------------------------------------------------
2023-10-13 08:31:12,765 EPOCH 5 done: loss 0.0604 - lr: 0.000017
2023-10-13 08:31:13,425 DEV : loss 0.15178388357162476 - f1-score (micro avg) 0.8668
2023-10-13 08:31:13,430 ----------------------------------------------------------------------------------------------------
2023-10-13 08:31:14,809 epoch 6 - iter 27/275 - loss 0.04081421 - time (sec): 1.38 - samples/sec: 1709.43 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:31:16,072 epoch 6 - iter 54/275 - loss 0.04499500 - time (sec): 2.64 - samples/sec: 1795.16 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:31:17,259 epoch 6 - iter 81/275 - loss 0.04264265 - time (sec): 3.83 - samples/sec: 1755.59 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:31:18,451 epoch 6 - iter 108/275 - loss 0.04066899 - time (sec): 5.02 - samples/sec: 1764.48 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:31:19,876 epoch 6 - iter 135/275 - loss 0.04266846 - time (sec): 6.44 - samples/sec: 1728.64 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:31:21,313 epoch 6 - iter 162/275 - loss 0.03859168 - time (sec): 7.88 - samples/sec: 1690.96 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:31:22,664 epoch 6 - iter 189/275 - loss 0.04050303 - time (sec): 9.23 - samples/sec: 1669.95 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:31:23,900 epoch 6 - iter 216/275 - loss 0.03833768 - time (sec): 10.47 - samples/sec: 1692.11 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:31:25,180 epoch 6 - iter 243/275 - loss 0.04402705 - time (sec): 11.75 - samples/sec: 1718.16 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:31:26,464 epoch 6 - iter 270/275 - loss 0.04151386 - time (sec): 13.03 - samples/sec: 1721.61 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:31:26,680 ----------------------------------------------------------------------------------------------------
2023-10-13 08:31:26,680 EPOCH 6 done: loss 0.0422 - lr: 0.000013
2023-10-13 08:31:27,425 DEV : loss 0.15648043155670166 - f1-score (micro avg) 0.8775
2023-10-13 08:31:27,431 saving best model
2023-10-13 08:31:28,034 ----------------------------------------------------------------------------------------------------
2023-10-13 08:31:29,450 epoch 7 - iter 27/275 - loss 0.04169935 - time (sec): 1.41 - samples/sec: 1607.86 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:31:30,721 epoch 7 - iter 54/275 - loss 0.02824331 - time (sec): 2.68 - samples/sec: 1592.78 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:31:31,988 epoch 7 - iter 81/275 - loss 0.03815050 - time (sec): 3.95 - samples/sec: 1679.31 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:31:33,215 epoch 7 - iter 108/275 - loss 0.02987254 - time (sec): 5.18 - samples/sec: 1722.11 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:31:34,512 epoch 7 - iter 135/275 - loss 0.03424007 - time (sec): 6.47 - samples/sec: 1730.27 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:31:35,896 epoch 7 - iter 162/275 - loss 0.03450026 - time (sec): 7.86 - samples/sec: 1708.45 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:31:37,308 epoch 7 - iter 189/275 - loss 0.03417843 - time (sec): 9.27 - samples/sec: 1706.92 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:31:38,693 epoch 7 - iter 216/275 - loss 0.03320135 - time (sec): 10.66 - samples/sec: 1678.75 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:31:40,145 epoch 7 - iter 243/275 - loss 0.02986744 - time (sec): 12.11 - samples/sec: 1667.99 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:31:41,528 epoch 7 - iter 270/275 - loss 0.03255562 - time (sec): 13.49 - samples/sec: 1657.39 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:31:41,742 ----------------------------------------------------------------------------------------------------
2023-10-13 08:31:41,743 EPOCH 7 done: loss 0.0320 - lr: 0.000010
2023-10-13 08:31:42,446 DEV : loss 0.1622074544429779 - f1-score (micro avg) 0.8645
2023-10-13 08:31:42,453 ----------------------------------------------------------------------------------------------------
2023-10-13 08:31:43,828 epoch 8 - iter 27/275 - loss 0.01838056 - time (sec): 1.37 - samples/sec: 1668.51 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:31:45,182 epoch 8 - iter 54/275 - loss 0.02235674 - time (sec): 2.73 - samples/sec: 1620.96 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:31:46,523 epoch 8 - iter 81/275 - loss 0.01827360 - time (sec): 4.07 - samples/sec: 1652.60 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:31:47,933 epoch 8 - iter 108/275 - loss 0.02429007 - time (sec): 5.48 - samples/sec: 1669.54 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:31:49,349 epoch 8 - iter 135/275 - loss 0.02657182 - time (sec): 6.89 - samples/sec: 1646.59 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:31:50,782 epoch 8 - iter 162/275 - loss 0.02556321 - time (sec): 8.33 - samples/sec: 1634.43 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:31:52,192 epoch 8 - iter 189/275 - loss 0.02914361 - time (sec): 9.74 - samples/sec: 1624.79 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:31:53,585 epoch 8 - iter 216/275 - loss 0.02779016 - time (sec): 11.13 - samples/sec: 1596.03 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:31:55,003 epoch 8 - iter 243/275 - loss 0.02659038 - time (sec): 12.55 - samples/sec: 1597.71 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:31:56,388 epoch 8 - iter 270/275 - loss 0.02472247 - time (sec): 13.93 - samples/sec: 1599.82 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:31:56,638 ----------------------------------------------------------------------------------------------------
2023-10-13 08:31:56,638 EPOCH 8 done: loss 0.0265 - lr: 0.000007
2023-10-13 08:31:57,338 DEV : loss 0.15169711410999298 - f1-score (micro avg) 0.894
2023-10-13 08:31:57,344 saving best model
2023-10-13 08:31:57,896 ----------------------------------------------------------------------------------------------------
2023-10-13 08:31:59,298 epoch 9 - iter 27/275 - loss 0.02854229 - time (sec): 1.40 - samples/sec: 1622.74 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:32:00,683 epoch 9 - iter 54/275 - loss 0.01793449 - time (sec): 2.79 - samples/sec: 1561.20 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:32:02,087 epoch 9 - iter 81/275 - loss 0.01751264 - time (sec): 4.19 - samples/sec: 1572.81 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:32:03,475 epoch 9 - iter 108/275 - loss 0.01705168 - time (sec): 5.58 - samples/sec: 1531.77 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:32:04,888 epoch 9 - iter 135/275 - loss 0.02420730 - time (sec): 6.99 - samples/sec: 1593.00 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:32:06,284 epoch 9 - iter 162/275 - loss 0.02315029 - time (sec): 8.39 - samples/sec: 1623.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:32:07,954 epoch 9 - iter 189/275 - loss 0.02030702 - time (sec): 10.06 - samples/sec: 1574.07 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:32:09,750 epoch 9 - iter 216/275 - loss 0.02223492 - time (sec): 11.85 - samples/sec: 1521.39 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:32:11,140 epoch 9 - iter 243/275 - loss 0.02034742 - time (sec): 13.24 - samples/sec: 1520.43 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:32:12,486 epoch 9 - iter 270/275 - loss 0.02192502 - time (sec): 14.59 - samples/sec: 1530.07 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:32:12,733 ----------------------------------------------------------------------------------------------------
2023-10-13 08:32:12,733 EPOCH 9 done: loss 0.0216 - lr: 0.000003
2023-10-13 08:32:13,467 DEV : loss 0.14888478815555573 - f1-score (micro avg) 0.8951
2023-10-13 08:32:13,474 saving best model
2023-10-13 08:32:14,259 ----------------------------------------------------------------------------------------------------
2023-10-13 08:32:15,685 epoch 10 - iter 27/275 - loss 0.03725164 - time (sec): 1.42 - samples/sec: 1607.61 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:32:17,128 epoch 10 - iter 54/275 - loss 0.04843116 - time (sec): 2.87 - samples/sec: 1594.73 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:32:18,538 epoch 10 - iter 81/275 - loss 0.03243751 - time (sec): 4.28 - samples/sec: 1633.39 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:32:20,014 epoch 10 - iter 108/275 - loss 0.03059973 - time (sec): 5.75 - samples/sec: 1584.55 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:32:21,378 epoch 10 - iter 135/275 - loss 0.02487169 - time (sec): 7.12 - samples/sec: 1582.45 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:32:22,789 epoch 10 - iter 162/275 - loss 0.02418583 - time (sec): 8.53 - samples/sec: 1616.61 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:32:24,210 epoch 10 - iter 189/275 - loss 0.02168156 - time (sec): 9.95 - samples/sec: 1595.54 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:32:25,575 epoch 10 - iter 216/275 - loss 0.01915487 - time (sec): 11.31 - samples/sec: 1591.14 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:32:26,927 epoch 10 - iter 243/275 - loss 0.01800133 - time (sec): 12.67 - samples/sec: 1585.16 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:32:28,275 epoch 10 - iter 270/275 - loss 0.01827690 - time (sec): 14.01 - samples/sec: 1589.32 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:32:28,531 ----------------------------------------------------------------------------------------------------
2023-10-13 08:32:28,531 EPOCH 10 done: loss 0.0179 - lr: 0.000000
2023-10-13 08:32:29,219 DEV : loss 0.15245532989501953 - f1-score (micro avg) 0.894
2023-10-13 08:32:29,686 ----------------------------------------------------------------------------------------------------
2023-10-13 08:32:29,687 Loading model from best epoch ...
2023-10-13 08:32:31,411 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:32:32,043
Results:
- F-score (micro) 0.9217
- F-score (macro) 0.8466
- Accuracy 0.8652
By class:
              precision    recall  f1-score   support

       scope     0.8950    0.9205    0.9076       176
        pers     0.9683    0.9531    0.9606       128
        work     0.9041    0.8919    0.8980        74
         loc     0.6667    1.0000    0.8000         2
      object     1.0000    0.5000    0.6667         2

   micro avg     0.9193    0.9241    0.9217       382
   macro avg     0.8868    0.8531    0.8466       382
weighted avg     0.9207    0.9241    0.9217       382
2023-10-13 08:32:32,043 ----------------------------------------------------------------------------------------------------