2024-09-03 22:08:14,383 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:14,384 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 1024, padding_idx=0)
        (position_embeddings): Embedding(512, 1024)
        (token_type_embeddings): Embedding(2, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-23): 24 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:14,384 Corpus: 2869 train + 338 dev + 370 test sentences
2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:14,384 Train: 2869 sentences
2024-09-03 22:08:14,384 (train_with_dev=False, train_with_test=False)
2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:14,384 Training Params:
2024-09-03 22:08:14,384 - learning_rate: "1e-05"
2024-09-03 22:08:14,384 - mini_batch_size: "32"
2024-09-03 22:08:14,384 - max_epochs: "20"
2024-09-03 22:08:14,384 - shuffle: "True"
2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:14,384 Plugins:
2024-09-03 22:08:14,384 - TensorboardLogger
2024-09-03 22:08:14,384 - LinearScheduler | warmup_fraction: '0.1'
2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
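Note on the learning-rate values: the settings above (learning_rate 1e-05, mini_batch_size 32, max_epochs 20, warmup_fraction 0.1) together with the 2869 training sentences are enough to estimate the `lr` column that appears in the per-iteration lines of this log. The sketch below assumes that the LinearScheduler plugin ramps the rate linearly from zero to the peak over the warmup steps and then decays it linearly to zero. That is an assumption about the schedule, not code taken from Flair, and the function name `linear_warmup_lr` is hypothetical.

```python
import math


def linear_warmup_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` optimizer steps.

    Assumed schedule: linear warmup from 0 to peak_lr, then linear decay to 0.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log header: 2869 train sentences, batch size 32, 20 epochs.
steps_per_epoch = math.ceil(2869 / 32)
total_steps = steps_per_epoch * 20

for epoch in (1, 2, 3, 20):
    step = epoch * steps_per_epoch
    lr = linear_warmup_lr(step, 1e-05, total_steps, 0.1)
    print(f"end of epoch {epoch}: lr = {lr:.6f}")
```

Under these assumptions there are 90 steps per epoch and 1800 steps in total, so warmup covers the first 180 steps (two epochs). The computed end-of-epoch rates for epochs 1 to 3 round to 0.000005, 0.000010 and 0.000009, which agrees with the "EPOCH n done" lines in this log.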
2024-09-03 22:08:14,384 Final evaluation on model from best epoch (best-model.pt)
2024-09-03 22:08:14,384 - metric: "('micro avg', 'f1-score')"
2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
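Note on best-model selection: the final evaluation uses the checkpoint from the epoch with the best dev micro-average F1. A minimal sketch of that selection rule follows, assuming a checkpoint is written only when the dev score strictly exceeds the best score so far, starting from 0.0. That starting value is an assumption, chosen because no "saving best model" line follows epoch 1 (dev F1 0.0). The helper name `epochs_that_save` is hypothetical; the scores are the DEV values reported for epochs 1 to 16 later in this log.

```python
def epochs_that_save(dev_scores, initial_best=0.0):
    """Return the 1-based epochs at which a new best dev score is reached."""
    best = initial_best
    saved = []
    for epoch, score in enumerate(dev_scores, start=1):
        if score > best:  # strict improvement only (assumed rule)
            best = score
            saved.append(epoch)
    return saved


# Dev micro-average F1 per epoch, copied from the DEV lines of this log.
dev_f1 = [
    0.0, 0.4444, 0.6408, 0.7163, 0.7235, 0.7658, 0.7627, 0.7504,
    0.7696, 0.7771, 0.7622, 0.777, 0.7644, 0.7741, 0.7665, 0.7658,
]

print(epochs_that_save(dev_f1))  # prints [2, 3, 4, 5, 6, 9, 10]
```

The printed epochs coincide with the epochs that are followed by a "saving best model" line in this log, so through epoch 16 the best checkpoint is the one from epoch 10 (dev F1 0.7771).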
2024-09-03 22:08:14,384 Computation:
2024-09-03 22:08:14,384 - compute on device: cuda:0
2024-09-03 22:08:14,384 - embedding storage: none
2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:14,384 Model training base path: "flair-barner-coarse-grained-gbert_large-bs32-e20-lr1e-05-2"
2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:14,385 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:14,385 Logging anything other than scalars to TensorBoard is currently not supported.
2024-09-03 22:08:18,474 epoch 1 - iter 9/90 - loss 3.07625880 - time (sec): 4.09 - samples/sec: 1570.98 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:08:22,848 epoch 1 - iter 18/90 - loss 3.03830863 - time (sec): 8.46 - samples/sec: 1486.84 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:08:27,463 epoch 1 - iter 27/90 - loss 2.92345816 - time (sec): 13.08 - samples/sec: 1448.18 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:08:31,615 epoch 1 - iter 36/90 - loss 2.76830934 - time (sec): 17.23 - samples/sec: 1445.16 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:08:35,829 epoch 1 - iter 45/90 - loss 2.55828324 - time (sec): 21.44 - samples/sec: 1445.58 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:08:40,281 epoch 1 - iter 54/90 - loss 2.27911363 - time (sec): 25.90 - samples/sec: 1443.03 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:08:43,966 epoch 1 - iter 63/90 - loss 2.04649360 - time (sec): 29.58 - samples/sec: 1451.91 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:08:48,754 epoch 1 - iter 72/90 - loss 1.85199452 - time (sec): 34.37 - samples/sec: 1430.44 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:08:52,698 epoch 1 - iter 81/90 - loss 1.70122297 - time (sec): 38.31 - samples/sec: 1439.88 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:08:56,719 epoch 1 - iter 90/90 - loss 1.57972824 - time (sec): 42.33 - samples/sec: 1450.22 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:08:56,720 ----------------------------------------------------------------------------------------------------
2024-09-03 22:08:56,720 EPOCH 1 done: loss 1.5797 - lr: 0.000005
2024-09-03 22:08:58,183 DEV : loss 0.46851032972335815 - f1-score (micro avg) 0.0
2024-09-03 22:08:58,187 ----------------------------------------------------------------------------------------------------
2024-09-03 22:09:02,729 epoch 2 - iter 9/90 - loss 0.39452797 - time (sec): 4.54 - samples/sec: 1372.01 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:09:07,298 epoch 2 - iter 18/90 - loss 0.34943089 - time (sec): 9.11 - samples/sec: 1363.54 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:09:11,191 epoch 2 - iter 27/90 - loss 0.33048569 - time (sec): 13.00 - samples/sec: 1440.78 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:09:14,919 epoch 2 - iter 36/90 - loss 0.33742609 - time (sec): 16.73 - samples/sec: 1473.34 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:09:19,119 epoch 2 - iter 45/90 - loss 0.32836169 - time (sec): 20.93 - samples/sec: 1473.38 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:09:23,366 epoch 2 - iter 54/90 - loss 0.32386979 - time (sec): 25.18 - samples/sec: 1468.88 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:09:27,760 epoch 2 - iter 63/90 - loss 0.31846694 - time (sec): 29.57 - samples/sec: 1451.24 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:09:32,410 epoch 2 - iter 72/90 - loss 0.30906101 - time (sec): 34.22 - samples/sec: 1428.99 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:09:37,250 epoch 2 - iter 81/90 - loss 0.30259718 - time (sec): 39.06 - samples/sec: 1417.97 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:09:41,763 epoch 2 - iter 90/90 - loss 0.29697888 - time (sec): 43.57 - samples/sec: 1408.93 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:09:41,763 ----------------------------------------------------------------------------------------------------
2024-09-03 22:09:41,763 EPOCH 2 done: loss 0.2970 - lr: 0.000010
2024-09-03 22:09:43,313 DEV : loss 0.32844364643096924 - f1-score (micro avg) 0.4444
2024-09-03 22:09:43,318 saving best model
2024-09-03 22:09:44,704 ----------------------------------------------------------------------------------------------------
2024-09-03 22:09:48,810 epoch 3 - iter 9/90 - loss 0.18021120 - time (sec): 4.10 - samples/sec: 1451.01 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:09:52,605 epoch 3 - iter 18/90 - loss 0.18710233 - time (sec): 7.90 - samples/sec: 1551.86 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:09:57,165 epoch 3 - iter 27/90 - loss 0.19642900 - time (sec): 12.46 - samples/sec: 1444.72 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:10:01,325 epoch 3 - iter 36/90 - loss 0.19534522 - time (sec): 16.62 - samples/sec: 1462.08 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:10:05,793 epoch 3 - iter 45/90 - loss 0.19526541 - time (sec): 21.09 - samples/sec: 1449.57 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:10:10,319 epoch 3 - iter 54/90 - loss 0.18898441 - time (sec): 25.61 - samples/sec: 1435.75 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:10:15,005 epoch 3 - iter 63/90 - loss 0.18734274 - time (sec): 30.30 - samples/sec: 1425.75 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:10:19,484 epoch 3 - iter 72/90 - loss 0.19035033 - time (sec): 34.78 - samples/sec: 1425.46 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:10:23,866 epoch 3 - iter 81/90 - loss 0.18770810 - time (sec): 39.16 - samples/sec: 1422.17 - lr: 0.000010 - momentum: 0.000000
2024-09-03 22:10:27,920 epoch 3 - iter 90/90 - loss 0.18703645 - time (sec): 43.22 - samples/sec: 1420.65 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:10:27,920 ----------------------------------------------------------------------------------------------------
2024-09-03 22:10:27,921 EPOCH 3 done: loss 0.1870 - lr: 0.000009
2024-09-03 22:10:29,473 DEV : loss 0.23561017215251923 - f1-score (micro avg) 0.6408
2024-09-03 22:10:29,478 saving best model
2024-09-03 22:10:31,205 ----------------------------------------------------------------------------------------------------
2024-09-03 22:10:35,117 epoch 4 - iter 9/90 - loss 0.17246494 - time (sec): 3.91 - samples/sec: 1531.46 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:10:39,356 epoch 4 - iter 18/90 - loss 0.15134387 - time (sec): 8.15 - samples/sec: 1475.32 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:10:43,735 epoch 4 - iter 27/90 - loss 0.13746185 - time (sec): 12.53 - samples/sec: 1472.62 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:10:47,797 epoch 4 - iter 36/90 - loss 0.13417880 - time (sec): 16.59 - samples/sec: 1476.25 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:10:52,926 epoch 4 - iter 45/90 - loss 0.12898354 - time (sec): 21.72 - samples/sec: 1426.82 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:10:57,315 epoch 4 - iter 54/90 - loss 0.12655651 - time (sec): 26.11 - samples/sec: 1426.80 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:01,590 epoch 4 - iter 63/90 - loss 0.12590330 - time (sec): 30.38 - samples/sec: 1426.07 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:06,149 epoch 4 - iter 72/90 - loss 0.12273481 - time (sec): 34.94 - samples/sec: 1423.24 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:10,823 epoch 4 - iter 81/90 - loss 0.12123292 - time (sec): 39.62 - samples/sec: 1408.31 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:14,547 epoch 4 - iter 90/90 - loss 0.11969701 - time (sec): 43.34 - samples/sec: 1416.55 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:14,547 ----------------------------------------------------------------------------------------------------
2024-09-03 22:11:14,547 EPOCH 4 done: loss 0.1197 - lr: 0.000009
2024-09-03 22:11:16,092 DEV : loss 0.19088450074195862 - f1-score (micro avg) 0.7163
2024-09-03 22:11:16,096 saving best model
2024-09-03 22:11:17,838 ----------------------------------------------------------------------------------------------------
2024-09-03 22:11:22,005 epoch 5 - iter 9/90 - loss 0.07796958 - time (sec): 4.17 - samples/sec: 1427.37 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:26,793 epoch 5 - iter 18/90 - loss 0.09505982 - time (sec): 8.95 - samples/sec: 1347.83 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:30,524 epoch 5 - iter 27/90 - loss 0.09038601 - time (sec): 12.69 - samples/sec: 1419.11 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:34,896 epoch 5 - iter 36/90 - loss 0.08880525 - time (sec): 17.06 - samples/sec: 1408.49 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:39,545 epoch 5 - iter 45/90 - loss 0.08665835 - time (sec): 21.71 - samples/sec: 1396.96 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:43,572 epoch 5 - iter 54/90 - loss 0.08590365 - time (sec): 25.73 - samples/sec: 1416.95 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:47,888 epoch 5 - iter 63/90 - loss 0.08303076 - time (sec): 30.05 - samples/sec: 1426.28 - lr: 0.000009 - momentum: 0.000000
2024-09-03 22:11:52,321 epoch 5 - iter 72/90 - loss 0.08046962 - time (sec): 34.48 - samples/sec: 1433.50 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:11:56,489 epoch 5 - iter 81/90 - loss 0.07746895 - time (sec): 38.65 - samples/sec: 1432.76 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:00,972 epoch 5 - iter 90/90 - loss 0.07524486 - time (sec): 43.13 - samples/sec: 1423.36 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:00,972 ----------------------------------------------------------------------------------------------------
2024-09-03 22:12:00,972 EPOCH 5 done: loss 0.0752 - lr: 0.000008
2024-09-03 22:12:02,521 DEV : loss 0.1980796456336975 - f1-score (micro avg) 0.7235
2024-09-03 22:12:02,525 saving best model
2024-09-03 22:12:04,280 ----------------------------------------------------------------------------------------------------
2024-09-03 22:12:09,200 epoch 6 - iter 9/90 - loss 0.05086435 - time (sec): 4.92 - samples/sec: 1307.95 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:13,315 epoch 6 - iter 18/90 - loss 0.05627511 - time (sec): 9.03 - samples/sec: 1388.10 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:17,684 epoch 6 - iter 27/90 - loss 0.05226291 - time (sec): 13.40 - samples/sec: 1401.62 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:21,743 epoch 6 - iter 36/90 - loss 0.04931367 - time (sec): 17.46 - samples/sec: 1410.58 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:26,185 epoch 6 - iter 45/90 - loss 0.04709793 - time (sec): 21.90 - samples/sec: 1403.82 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:30,399 epoch 6 - iter 54/90 - loss 0.05018191 - time (sec): 26.12 - samples/sec: 1416.40 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:34,633 epoch 6 - iter 63/90 - loss 0.04949260 - time (sec): 30.35 - samples/sec: 1410.00 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:39,250 epoch 6 - iter 72/90 - loss 0.05163362 - time (sec): 34.97 - samples/sec: 1394.74 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:43,749 epoch 6 - iter 81/90 - loss 0.04998041 - time (sec): 39.47 - samples/sec: 1401.93 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:47,550 epoch 6 - iter 90/90 - loss 0.04991602 - time (sec): 43.27 - samples/sec: 1418.90 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:47,550 ----------------------------------------------------------------------------------------------------
2024-09-03 22:12:47,550 EPOCH 6 done: loss 0.0499 - lr: 0.000008
2024-09-03 22:12:49,108 DEV : loss 0.17487134039402008 - f1-score (micro avg) 0.7658
2024-09-03 22:12:49,113 saving best model
2024-09-03 22:12:50,858 ----------------------------------------------------------------------------------------------------
2024-09-03 22:12:54,868 epoch 7 - iter 9/90 - loss 0.02538037 - time (sec): 4.01 - samples/sec: 1477.46 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:12:59,226 epoch 7 - iter 18/90 - loss 0.03372476 - time (sec): 8.37 - samples/sec: 1442.49 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:13:03,761 epoch 7 - iter 27/90 - loss 0.03282378 - time (sec): 12.90 - samples/sec: 1432.08 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:13:07,554 epoch 7 - iter 36/90 - loss 0.03431105 - time (sec): 16.69 - samples/sec: 1446.04 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:13:12,006 epoch 7 - iter 45/90 - loss 0.03225077 - time (sec): 21.15 - samples/sec: 1452.28 - lr: 0.000008 - momentum: 0.000000
2024-09-03 22:13:16,824 epoch 7 - iter 54/90 - loss 0.03304733 - time (sec): 25.96 - samples/sec: 1419.88 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:21,125 epoch 7 - iter 63/90 - loss 0.03408886 - time (sec): 30.27 - samples/sec: 1428.43 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:25,202 epoch 7 - iter 72/90 - loss 0.03776201 - time (sec): 34.34 - samples/sec: 1434.95 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:29,135 epoch 7 - iter 81/90 - loss 0.03710028 - time (sec): 38.28 - samples/sec: 1448.37 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:33,099 epoch 7 - iter 90/90 - loss 0.03726568 - time (sec): 42.24 - samples/sec: 1453.46 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:33,099 ----------------------------------------------------------------------------------------------------
2024-09-03 22:13:33,099 EPOCH 7 done: loss 0.0373 - lr: 0.000007
2024-09-03 22:13:34,650 DEV : loss 0.18173354864120483 - f1-score (micro avg) 0.7627
2024-09-03 22:13:34,654 ----------------------------------------------------------------------------------------------------
2024-09-03 22:13:38,876 epoch 8 - iter 9/90 - loss 0.02092667 - time (sec): 4.22 - samples/sec: 1404.86 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:42,977 epoch 8 - iter 18/90 - loss 0.02438277 - time (sec): 8.32 - samples/sec: 1452.84 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:47,147 epoch 8 - iter 27/90 - loss 0.02262059 - time (sec): 12.49 - samples/sec: 1462.61 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:51,474 epoch 8 - iter 36/90 - loss 0.02522541 - time (sec): 16.82 - samples/sec: 1484.42 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:13:55,997 epoch 8 - iter 45/90 - loss 0.02481957 - time (sec): 21.34 - samples/sec: 1457.11 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:00,761 epoch 8 - iter 54/90 - loss 0.02359039 - time (sec): 26.11 - samples/sec: 1438.57 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:05,146 epoch 8 - iter 63/90 - loss 0.02355433 - time (sec): 30.49 - samples/sec: 1435.74 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:09,274 epoch 8 - iter 72/90 - loss 0.02419780 - time (sec): 34.62 - samples/sec: 1421.47 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:13,543 epoch 8 - iter 81/90 - loss 0.02375625 - time (sec): 38.89 - samples/sec: 1422.79 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:17,829 epoch 8 - iter 90/90 - loss 0.02349011 - time (sec): 43.17 - samples/sec: 1422.01 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:17,829 ----------------------------------------------------------------------------------------------------
2024-09-03 22:14:17,829 EPOCH 8 done: loss 0.0235 - lr: 0.000007
2024-09-03 22:14:19,380 DEV : loss 0.20718424022197723 - f1-score (micro avg) 0.7504
2024-09-03 22:14:19,384 ----------------------------------------------------------------------------------------------------
2024-09-03 22:14:23,485 epoch 9 - iter 9/90 - loss 0.02255550 - time (sec): 4.10 - samples/sec: 1531.81 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:27,762 epoch 9 - iter 18/90 - loss 0.01867817 - time (sec): 8.38 - samples/sec: 1470.40 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:32,041 epoch 9 - iter 27/90 - loss 0.01821711 - time (sec): 12.66 - samples/sec: 1460.17 - lr: 0.000007 - momentum: 0.000000
2024-09-03 22:14:36,587 epoch 9 - iter 36/90 - loss 0.01817603 - time (sec): 17.20 - samples/sec: 1439.66 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:14:40,717 epoch 9 - iter 45/90 - loss 0.01812448 - time (sec): 21.33 - samples/sec: 1437.53 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:14:44,676 epoch 9 - iter 54/90 - loss 0.01706299 - time (sec): 25.29 - samples/sec: 1452.34 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:14:49,248 epoch 9 - iter 63/90 - loss 0.01744359 - time (sec): 29.86 - samples/sec: 1440.48 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:14:53,395 epoch 9 - iter 72/90 - loss 0.01723729 - time (sec): 34.01 - samples/sec: 1439.52 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:14:58,196 epoch 9 - iter 81/90 - loss 0.01710673 - time (sec): 38.81 - samples/sec: 1422.73 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:02,430 epoch 9 - iter 90/90 - loss 0.01713626 - time (sec): 43.05 - samples/sec: 1426.26 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:02,431 ----------------------------------------------------------------------------------------------------
2024-09-03 22:15:02,431 EPOCH 9 done: loss 0.0171 - lr: 0.000006
2024-09-03 22:15:03,983 DEV : loss 0.18645833432674408 - f1-score (micro avg) 0.7696
2024-09-03 22:15:03,987 saving best model
2024-09-03 22:15:05,710 ----------------------------------------------------------------------------------------------------
2024-09-03 22:15:10,114 epoch 10 - iter 9/90 - loss 0.01487332 - time (sec): 4.40 - samples/sec: 1371.66 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:14,752 epoch 10 - iter 18/90 - loss 0.01588230 - time (sec): 9.04 - samples/sec: 1351.00 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:18,960 epoch 10 - iter 27/90 - loss 0.01385364 - time (sec): 13.25 - samples/sec: 1368.66 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:23,340 epoch 10 - iter 36/90 - loss 0.01439257 - time (sec): 17.63 - samples/sec: 1377.16 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:28,062 epoch 10 - iter 45/90 - loss 0.01401395 - time (sec): 22.35 - samples/sec: 1361.45 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:32,318 epoch 10 - iter 54/90 - loss 0.01346382 - time (sec): 26.61 - samples/sec: 1377.73 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:36,703 epoch 10 - iter 63/90 - loss 0.01382917 - time (sec): 30.99 - samples/sec: 1382.62 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:41,034 epoch 10 - iter 72/90 - loss 0.01399135 - time (sec): 35.32 - samples/sec: 1397.26 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:45,489 epoch 10 - iter 81/90 - loss 0.01359549 - time (sec): 39.78 - samples/sec: 1400.47 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:49,317 epoch 10 - iter 90/90 - loss 0.01459325 - time (sec): 43.60 - samples/sec: 1407.96 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:15:49,317 ----------------------------------------------------------------------------------------------------
2024-09-03 22:15:49,317 EPOCH 10 done: loss 0.0146 - lr: 0.000006
2024-09-03 22:15:50,868 DEV : loss 0.2126888781785965 - f1-score (micro avg) 0.7771
2024-09-03 22:15:50,872 saving best model
2024-09-03 22:15:52,597 ----------------------------------------------------------------------------------------------------
2024-09-03 22:15:56,519 epoch 11 - iter 9/90 - loss 0.01083532 - time (sec): 3.92 - samples/sec: 1537.40 - lr: 0.000006 - momentum: 0.000000
2024-09-03 22:16:00,719 epoch 11 - iter 18/90 - loss 0.01369081 - time (sec): 8.12 - samples/sec: 1519.59 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:04,964 epoch 11 - iter 27/90 - loss 0.01182479 - time (sec): 12.37 - samples/sec: 1520.67 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:09,825 epoch 11 - iter 36/90 - loss 0.01111172 - time (sec): 17.23 - samples/sec: 1467.37 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:14,205 epoch 11 - iter 45/90 - loss 0.00999335 - time (sec): 21.61 - samples/sec: 1456.64 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:18,570 epoch 11 - iter 54/90 - loss 0.00959961 - time (sec): 25.97 - samples/sec: 1445.48 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:23,295 epoch 11 - iter 63/90 - loss 0.00971446 - time (sec): 30.70 - samples/sec: 1422.94 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:27,451 epoch 11 - iter 72/90 - loss 0.00968067 - time (sec): 34.85 - samples/sec: 1427.69 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:31,509 epoch 11 - iter 81/90 - loss 0.00976322 - time (sec): 38.91 - samples/sec: 1433.25 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:35,289 epoch 11 - iter 90/90 - loss 0.01012837 - time (sec): 42.69 - samples/sec: 1438.13 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:35,289 ----------------------------------------------------------------------------------------------------
2024-09-03 22:16:35,289 EPOCH 11 done: loss 0.0101 - lr: 0.000005
2024-09-03 22:16:36,843 DEV : loss 0.23299568891525269 - f1-score (micro avg) 0.7622
2024-09-03 22:16:36,848 ----------------------------------------------------------------------------------------------------
2024-09-03 22:16:41,085 epoch 12 - iter 9/90 - loss 0.00751245 - time (sec): 4.24 - samples/sec: 1389.69 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:45,480 epoch 12 - iter 18/90 - loss 0.01017883 - time (sec): 8.63 - samples/sec: 1372.66 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:49,650 epoch 12 - iter 27/90 - loss 0.01071932 - time (sec): 12.80 - samples/sec: 1398.85 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:53,924 epoch 12 - iter 36/90 - loss 0.01041932 - time (sec): 17.08 - samples/sec: 1426.95 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:16:58,901 epoch 12 - iter 45/90 - loss 0.01116460 - time (sec): 22.05 - samples/sec: 1390.56 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:17:03,101 epoch 12 - iter 54/90 - loss 0.01020009 - time (sec): 26.25 - samples/sec: 1409.48 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:17:07,242 epoch 12 - iter 63/90 - loss 0.01021957 - time (sec): 30.39 - samples/sec: 1421.30 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:17:11,517 epoch 12 - iter 72/90 - loss 0.01010492 - time (sec): 34.67 - samples/sec: 1430.57 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:17:16,084 epoch 12 - iter 81/90 - loss 0.00985138 - time (sec): 39.24 - samples/sec: 1415.36 - lr: 0.000005 - momentum: 0.000000
2024-09-03 22:17:20,410 epoch 12 - iter 90/90 - loss 0.00982517 - time (sec): 43.56 - samples/sec: 1409.35 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:17:20,411 ----------------------------------------------------------------------------------------------------
2024-09-03 22:17:20,411 EPOCH 12 done: loss 0.0098 - lr: 0.000004
2024-09-03 22:17:21,963 DEV : loss 0.24243620038032532 - f1-score (micro avg) 0.777
2024-09-03 22:17:21,967 ----------------------------------------------------------------------------------------------------
2024-09-03 22:17:26,137 epoch 13 - iter 9/90 - loss 0.00363056 - time (sec): 4.17 - samples/sec: 1524.30 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:17:30,774 epoch 13 - iter 18/90 - loss 0.00653768 - time (sec): 8.81 - samples/sec: 1458.55 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:17:34,943 epoch 13 - iter 27/90 - loss 0.00636075 - time (sec): 12.98 - samples/sec: 1434.60 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:17:39,974 epoch 13 - iter 36/90 - loss 0.00665004 - time (sec): 18.01 - samples/sec: 1404.74 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:17:43,874 epoch 13 - iter 45/90 - loss 0.00650295 - time (sec): 21.91 - samples/sec: 1436.15 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:17:48,318 epoch 13 - iter 54/90 - loss 0.00639820 - time (sec): 26.35 - samples/sec: 1437.45 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:17:52,663 epoch 13 - iter 63/90 - loss 0.00598547 - time (sec): 30.70 - samples/sec: 1433.65 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:17:56,496 epoch 13 - iter 72/90 - loss 0.00643427 - time (sec): 34.53 - samples/sec: 1438.07 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:01,189 epoch 13 - iter 81/90 - loss 0.00685379 - time (sec): 39.22 - samples/sec: 1418.92 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:05,134 epoch 13 - iter 90/90 - loss 0.00733702 - time (sec): 43.17 - samples/sec: 1422.30 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:05,134 ----------------------------------------------------------------------------------------------------
2024-09-03 22:18:05,134 EPOCH 13 done: loss 0.0073 - lr: 0.000004
2024-09-03 22:18:06,691 DEV : loss 0.2638837397098541 - f1-score (micro avg) 0.7644
2024-09-03 22:18:06,695 ----------------------------------------------------------------------------------------------------
2024-09-03 22:18:11,175 epoch 14 - iter 9/90 - loss 0.00829397 - time (sec): 4.48 - samples/sec: 1385.83 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:15,704 epoch 14 - iter 18/90 - loss 0.00671472 - time (sec): 9.01 - samples/sec: 1377.56 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:20,260 epoch 14 - iter 27/90 - loss 0.00751253 - time (sec): 13.56 - samples/sec: 1386.83 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:24,624 epoch 14 - iter 36/90 - loss 0.00876323 - time (sec): 17.93 - samples/sec: 1412.73 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:28,316 epoch 14 - iter 45/90 - loss 0.00794374 - time (sec): 21.62 - samples/sec: 1441.03 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:32,587 epoch 14 - iter 54/90 - loss 0.00775956 - time (sec): 25.89 - samples/sec: 1439.29 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:37,191 epoch 14 - iter 63/90 - loss 0.00802070 - time (sec): 30.49 - samples/sec: 1430.57 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:41,731 epoch 14 - iter 72/90 - loss 0.00771433 - time (sec): 35.03 - samples/sec: 1419.19 - lr: 0.000004 - momentum: 0.000000
2024-09-03 22:18:46,062 epoch 14 - iter 81/90 - loss 0.00721965 - time (sec): 39.37 - samples/sec: 1415.81 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:18:50,041 epoch 14 - iter 90/90 - loss 0.00707296 - time (sec): 43.35 - samples/sec: 1416.38 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:18:50,042 ----------------------------------------------------------------------------------------------------
2024-09-03 22:18:50,042 EPOCH 14 done: loss 0.0071 - lr: 0.000003
2024-09-03 22:18:51,597 DEV : loss 0.2589911222457886 - f1-score (micro avg) 0.7741
2024-09-03 22:18:51,601 ----------------------------------------------------------------------------------------------------
2024-09-03 22:18:55,754 epoch 15 - iter 9/90 - loss 0.00633966 - time (sec): 4.15 - samples/sec: 1415.98 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:18:59,824 epoch 15 - iter 18/90 - loss 0.00596012 - time (sec): 8.22 - samples/sec: 1455.43 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:04,258 epoch 15 - iter 27/90 - loss 0.00623353 - time (sec): 12.66 - samples/sec: 1417.82 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:09,005 epoch 15 - iter 36/90 - loss 0.00527114 - time (sec): 17.40 - samples/sec: 1380.87 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:13,130 epoch 15 - iter 45/90 - loss 0.00521481 - time (sec): 21.53 - samples/sec: 1422.56 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:17,904 epoch 15 - iter 54/90 - loss 0.00494592 - time (sec): 26.30 - samples/sec: 1397.10 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:22,027 epoch 15 - iter 63/90 - loss 0.00472614 - time (sec): 30.43 - samples/sec: 1409.77 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:26,657 epoch 15 - iter 72/90 - loss 0.00487358 - time (sec): 35.05 - samples/sec: 1405.03 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:30,886 epoch 15 - iter 81/90 - loss 0.00551019 - time (sec): 39.28 - samples/sec: 1412.28 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:34,800 epoch 15 - iter 90/90 - loss 0.00570350 - time (sec): 43.20 - samples/sec: 1421.22 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:34,800 ----------------------------------------------------------------------------------------------------
2024-09-03 22:19:34,800 EPOCH 15 done: loss 0.0057 - lr: 0.000003
2024-09-03 22:19:36,357 DEV : loss 0.27159348130226135 - f1-score (micro avg) 0.7665
2024-09-03 22:19:36,361 ----------------------------------------------------------------------------------------------------
2024-09-03 22:19:40,741 epoch 16 - iter 9/90 - loss 0.00459489 - time (sec): 4.38 - samples/sec: 1442.57 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:44,607 epoch 16 - iter 18/90 - loss 0.00618470 - time (sec): 8.25 - samples/sec: 1493.38 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:49,028 epoch 16 - iter 27/90 - loss 0.00510669 - time (sec): 12.67 - samples/sec: 1444.92 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:53,348 epoch 16 - iter 36/90 - loss 0.00570184 - time (sec): 16.99 - samples/sec: 1447.83 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:19:57,316 epoch 16 - iter 45/90 - loss 0.00552000 - time (sec): 20.95 - samples/sec: 1469.67 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:20:01,570 epoch 16 - iter 54/90 - loss 0.00581402 - time (sec): 25.21 - samples/sec: 1468.82 - lr: 0.000003 - momentum: 0.000000
2024-09-03 22:20:06,247 epoch 16 - iter 63/90 - loss 0.00559269 - time (sec): 29.89 - samples/sec: 1446.70 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:10,585 epoch 16 - iter 72/90 - loss 0.00515513 - time (sec): 34.22 - samples/sec: 1442.34 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:15,449 epoch 16 - iter 81/90 - loss 0.00474433 - time (sec): 39.09 - samples/sec: 1420.96 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:19,518 epoch 16 - iter 90/90 - loss 0.00474793 - time (sec): 43.16 - samples/sec: 1422.58 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:19,519 ----------------------------------------------------------------------------------------------------
2024-09-03 22:20:19,519 EPOCH 16 done: loss 0.0047 - lr: 0.000002
2024-09-03 22:20:21,072 DEV : loss 0.2844613194465637 - f1-score (micro avg) 0.7658
2024-09-03 22:20:21,077 ----------------------------------------------------------------------------------------------------
2024-09-03 22:20:25,936 epoch 17 - iter 9/90 - loss 0.00543864 - time (sec): 4.86 - samples/sec: 1376.13 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:30,131 epoch 17 - iter 18/90 - loss 0.00405570 - time (sec): 9.05 - samples/sec: 1424.00 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:34,391 epoch 17 - iter 27/90 - loss 0.00364881 - time (sec): 13.31 - samples/sec: 1433.67 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:38,389 epoch 17 - iter 36/90 - loss 0.00321257 - time (sec): 17.31 - samples/sec: 1454.74 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:43,237 epoch 17 - iter 45/90 - loss 0.00351433 - time (sec): 22.16 - samples/sec: 1419.70 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:47,371 epoch 17 - iter 54/90 - loss 0.00378463 - time (sec): 26.29 - samples/sec: 1428.58 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:51,524 epoch 17 - iter 63/90 - loss 0.00363362 - time (sec): 30.45 - samples/sec: 1431.58 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:55,811 epoch 17 - iter 72/90 - loss 0.00368783 - time (sec): 34.73 - samples/sec: 1430.27 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:20:59,926 epoch 17 - iter 81/90 - loss 0.00365053 - time (sec): 38.85 - samples/sec: 1431.33 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:21:03,944 epoch 17 - iter 90/90 - loss 0.00348700 - time (sec): 42.87 - samples/sec: 1432.22 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:21:03,944 ----------------------------------------------------------------------------------------------------
2024-09-03 22:21:03,944 EPOCH 17 done: loss 0.0035 - lr: 0.000002
2024-09-03 22:21:05,496 DEV : loss 0.2972029745578766 - f1-score (micro avg) 0.773
2024-09-03 22:21:05,500 ----------------------------------------------------------------------------------------------------
2024-09-03 22:21:10,081 epoch 18 - iter 9/90 - loss 0.00473264 - time (sec): 4.58 - samples/sec: 1372.21 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:21:14,856 epoch 18 - iter 18/90 - loss 0.00333959 - time (sec): 9.35 - samples/sec: 1333.79 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:21:19,144 epoch 18 - iter 27/90 - loss 0.00386776 - time (sec): 13.64 - samples/sec: 1362.51 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:21:23,545 epoch 18 - iter 36/90 - loss 0.00317074 - time (sec): 18.04 - samples/sec: 1375.56 - lr: 0.000002 - momentum: 0.000000
2024-09-03 22:21:27,632 epoch 18 - iter 45/90 - loss 0.00345585 - time (sec): 22.13 - samples/sec: 1401.58 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:21:31,759 epoch 18 - iter 54/90 - loss 0.00323514 - time (sec): 26.26 - samples/sec: 1406.92 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:21:36,213 epoch 18 - iter 63/90 - loss 0.00309887 - time (sec): 30.71 - samples/sec: 1406.77 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:21:40,796 epoch 18 - iter 72/90 - loss 0.00289858 - time (sec): 35.29 - samples/sec: 1405.24 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:21:44,868 epoch 18 - iter 81/90 - loss 0.00310884 - time (sec): 39.37 - samples/sec: 1418.45 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:21:48,652 epoch 18 - iter 90/90 - loss 0.00345453 - time (sec): 43.15 - samples/sec: 1422.77 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:21:48,653 ----------------------------------------------------------------------------------------------------
2024-09-03 22:21:48,653 EPOCH 18 done: loss 0.0035 - lr: 0.000001
2024-09-03 22:21:50,219 DEV : loss 0.3015914261341095 - f1-score (micro avg) 0.7741
2024-09-03 22:21:50,224 ----------------------------------------------------------------------------------------------------
2024-09-03 22:21:54,090 epoch 19 - iter 9/90 - loss 0.00418814 - time (sec): 3.87 - samples/sec: 1480.24 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:21:58,821 epoch 19 - iter 18/90 - loss 0.00225645 - time (sec): 8.60 - samples/sec: 1405.29 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:03,440 epoch 19 - iter 27/90 - loss 0.00205094 - time (sec): 13.22 - samples/sec: 1403.71 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:07,670 epoch 19 - iter 36/90 - loss 0.00184944 - time (sec): 17.45 - samples/sec: 1422.03 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:12,181 epoch 19 - iter 45/90 - loss 0.00180956 - time (sec): 21.96 - samples/sec: 1410.69 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:16,249 epoch 19 - iter 54/90 - loss 0.00175773 - time (sec): 26.02 - samples/sec: 1414.19 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:20,450 epoch 19 - iter 63/90 - loss 0.00194382 - time (sec): 30.23 - samples/sec: 1418.41 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:24,865 epoch 19 - iter 72/90 - loss 0.00192499 - time (sec): 34.64 - samples/sec: 1422.31 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:29,214 epoch 19 - iter 81/90 - loss 0.00199577 - time (sec): 38.99 - samples/sec: 1419.45 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:33,476 epoch 19 - iter 90/90 - loss 0.00197310 - time (sec): 43.25 - samples/sec: 1419.48 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:33,476 ----------------------------------------------------------------------------------------------------
2024-09-03 22:22:33,476 EPOCH 19 done: loss 0.0020 - lr: 0.000001
2024-09-03 22:22:35,029 DEV : loss 0.3135406970977783 - f1-score (micro avg) 0.7702
2024-09-03 22:22:35,033 ----------------------------------------------------------------------------------------------------
2024-09-03 22:22:39,086 epoch 20 - iter 9/90 - loss 0.00362040 - time (sec): 4.05 - samples/sec: 1462.60 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:43,561 epoch 20 - iter 18/90 - loss 0.00299681 - time (sec): 8.53 - samples/sec: 1423.89 - lr: 0.000001 - momentum: 0.000000
2024-09-03 22:22:47,733 epoch 20 - iter 27/90 - loss 0.00273788 - time (sec): 12.70 - samples/sec: 1465.52 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:22:51,996 epoch 20 - iter 36/90 - loss 0.00258476 - time (sec): 16.96 - samples/sec: 1454.16 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:22:56,328 epoch 20 - iter 45/90 - loss 0.00252421 - time (sec): 21.29 - samples/sec: 1447.19 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:23:00,763 epoch 20 - iter 54/90 - loss 0.00236808 - time (sec): 25.73 - samples/sec: 1445.51 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:23:05,247 epoch 20 - iter 63/90 - loss 0.00220282 - time (sec): 30.21 - samples/sec: 1433.87 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:23:10,243 epoch 20 - iter 72/90 - loss 0.00214946 - time (sec): 35.21 - samples/sec: 1412.84 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:23:14,194 epoch 20 - iter 81/90 - loss 0.00246307 - time (sec): 39.16 - samples/sec: 1418.42 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:23:18,525 epoch 20 - iter 90/90 - loss 0.00240450 - time (sec): 43.49 - samples/sec: 1411.64 - lr: 0.000000 - momentum: 0.000000
2024-09-03 22:23:18,526 ----------------------------------------------------------------------------------------------------
2024-09-03 22:23:18,526 EPOCH 20 done: loss 0.0024 - lr: 0.000000
2024-09-03 22:23:20,078 DEV : loss 0.3147675693035126 - f1-score (micro avg) 0.7702
2024-09-03 22:23:21,239 ----------------------------------------------------------------------------------------------------
2024-09-03 22:23:21,240 Loading model from best epoch ...
2024-09-03 22:23:25,153 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-MISC, B-MISC, E-MISC, I-MISC, S-ORG, B-ORG, E-ORG, I-ORG
2024-09-03 22:23:26,698
Results:
- F-score (micro) 0.7297
- F-score (macro) 0.6833
- Accuracy 0.6119
By class:
              precision    recall  f1-score   support
         ORG     0.7266    0.7949    0.7592       117
         PER     0.8158    0.9538    0.8794        65
         LOC     0.7231    0.7581    0.7402        62
        MISC     0.5600    0.2593    0.3544        54
   micro avg     0.7347    0.7248    0.7297       298
   macro avg     0.7064    0.6915    0.6833       298
weighted avg     0.7151    0.7248    0.7081       298
2024-09-03 22:23:26,698 ----------------------------------------------------------------------------------------------------