Upload ./training.log with huggingface_hub
training.log ADDED (+242 -0)
2023-10-25 21:32:57,504 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,504 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 Train: 1166 sentences
2023-10-25 21:32:57,505 (train_with_dev=False, train_with_test=False)
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 Training Params:
2023-10-25 21:32:57,505 - learning_rate: "5e-05"
2023-10-25 21:32:57,505 - mini_batch_size: "4"
2023-10-25 21:32:57,505 - max_epochs: "10"
2023-10-25 21:32:57,505 - shuffle: "True"
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 Plugins:
2023-10-25 21:32:57,505 - TensorboardLogger
2023-10-25 21:32:57,505 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:32:57,505 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 Computation:
2023-10-25 21:32:57,505 - compute on device: cuda:0
2023-10-25 21:32:57,505 - embedding storage: none
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:57,505 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:32:58,806 epoch 1 - iter 29/292 - loss 2.45137787 - time (sec): 1.30 - samples/sec: 3120.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:33:00,122 epoch 1 - iter 58/292 - loss 1.70444850 - time (sec): 2.62 - samples/sec: 2993.17 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:33:01,467 epoch 1 - iter 87/292 - loss 1.36569673 - time (sec): 3.96 - samples/sec: 3159.08 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:33:02,798 epoch 1 - iter 116/292 - loss 1.13542738 - time (sec): 5.29 - samples/sec: 3233.59 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:33:04,065 epoch 1 - iter 145/292 - loss 0.95947046 - time (sec): 6.56 - samples/sec: 3311.90 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:33:05,349 epoch 1 - iter 174/292 - loss 0.86059002 - time (sec): 7.84 - samples/sec: 3267.81 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:33:06,715 epoch 1 - iter 203/292 - loss 0.75774356 - time (sec): 9.21 - samples/sec: 3338.69 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:33:08,059 epoch 1 - iter 232/292 - loss 0.68013281 - time (sec): 10.55 - samples/sec: 3389.51 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:33:09,369 epoch 1 - iter 261/292 - loss 0.63213092 - time (sec): 11.86 - samples/sec: 3397.16 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:33:10,649 epoch 1 - iter 290/292 - loss 0.59913023 - time (sec): 13.14 - samples/sec: 3363.80 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:33:10,733 ----------------------------------------------------------------------------------------------------
2023-10-25 21:33:10,733 EPOCH 1 done: loss 0.5985 - lr: 0.000049
2023-10-25 21:33:11,405 DEV : loss 0.12858878076076508 - f1-score (micro avg)  0.6058
2023-10-25 21:33:11,409 saving best model
2023-10-25 21:33:11,920 ----------------------------------------------------------------------------------------------------
2023-10-25 21:33:13,209 epoch 2 - iter 29/292 - loss 0.15207670 - time (sec): 1.29 - samples/sec: 3462.10 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:33:14,531 epoch 2 - iter 58/292 - loss 0.17254937 - time (sec): 2.61 - samples/sec: 3369.96 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:33:15,837 epoch 2 - iter 87/292 - loss 0.16515785 - time (sec): 3.92 - samples/sec: 3320.56 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:33:17,171 epoch 2 - iter 116/292 - loss 0.15565784 - time (sec): 5.25 - samples/sec: 3367.25 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:33:18,426 epoch 2 - iter 145/292 - loss 0.15034919 - time (sec): 6.51 - samples/sec: 3346.07 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:33:19,663 epoch 2 - iter 174/292 - loss 0.15226906 - time (sec): 7.74 - samples/sec: 3387.64 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:33:20,920 epoch 2 - iter 203/292 - loss 0.15082330 - time (sec): 9.00 - samples/sec: 3375.79 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:33:22,195 epoch 2 - iter 232/292 - loss 0.15194885 - time (sec): 10.27 - samples/sec: 3383.93 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:33:23,556 epoch 2 - iter 261/292 - loss 0.15456539 - time (sec): 11.63 - samples/sec: 3380.94 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:33:24,901 epoch 2 - iter 290/292 - loss 0.14927010 - time (sec): 12.98 - samples/sec: 3393.23 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:33:24,981 ----------------------------------------------------------------------------------------------------
2023-10-25 21:33:24,981 EPOCH 2 done: loss 0.1483 - lr: 0.000045
2023-10-25 21:33:25,896 DEV : loss 0.1477406919002533 - f1-score (micro avg)  0.6391
2023-10-25 21:33:25,900 saving best model
2023-10-25 21:33:26,570 ----------------------------------------------------------------------------------------------------
2023-10-25 21:33:28,092 epoch 3 - iter 29/292 - loss 0.09222681 - time (sec): 1.52 - samples/sec: 3853.44 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:33:29,401 epoch 3 - iter 58/292 - loss 0.09581570 - time (sec): 2.83 - samples/sec: 3688.13 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:33:30,640 epoch 3 - iter 87/292 - loss 0.09699714 - time (sec): 4.07 - samples/sec: 3597.96 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:33:31,923 epoch 3 - iter 116/292 - loss 0.09432914 - time (sec): 5.35 - samples/sec: 3493.39 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:33:33,252 epoch 3 - iter 145/292 - loss 0.09100948 - time (sec): 6.68 - samples/sec: 3462.50 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:33:34,541 epoch 3 - iter 174/292 - loss 0.08813279 - time (sec): 7.97 - samples/sec: 3403.92 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:33:35,854 epoch 3 - iter 203/292 - loss 0.08542829 - time (sec): 9.28 - samples/sec: 3409.01 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:33:37,116 epoch 3 - iter 232/292 - loss 0.08503915 - time (sec): 10.54 - samples/sec: 3326.93 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:33:38,435 epoch 3 - iter 261/292 - loss 0.08523882 - time (sec): 11.86 - samples/sec: 3379.59 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:33:39,681 epoch 3 - iter 290/292 - loss 0.08483685 - time (sec): 13.11 - samples/sec: 3373.62 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:33:39,767 ----------------------------------------------------------------------------------------------------
2023-10-25 21:33:39,768 EPOCH 3 done: loss 0.0858 - lr: 0.000039
2023-10-25 21:33:40,839 DEV : loss 0.1286671906709671 - f1-score (micro avg)  0.7152
2023-10-25 21:33:40,843 saving best model
2023-10-25 21:33:41,512 ----------------------------------------------------------------------------------------------------
2023-10-25 21:33:42,842 epoch 4 - iter 29/292 - loss 0.06579822 - time (sec): 1.33 - samples/sec: 3197.68 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:33:44,181 epoch 4 - iter 58/292 - loss 0.05383181 - time (sec): 2.67 - samples/sec: 3158.85 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:33:45,500 epoch 4 - iter 87/292 - loss 0.04876507 - time (sec): 3.98 - samples/sec: 3274.58 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:33:46,811 epoch 4 - iter 116/292 - loss 0.04823903 - time (sec): 5.29 - samples/sec: 3210.16 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:33:48,194 epoch 4 - iter 145/292 - loss 0.05376063 - time (sec): 6.68 - samples/sec: 3382.61 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:33:49,494 epoch 4 - iter 174/292 - loss 0.05346591 - time (sec): 7.98 - samples/sec: 3433.92 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:33:50,743 epoch 4 - iter 203/292 - loss 0.05630322 - time (sec): 9.23 - samples/sec: 3443.19 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:33:52,047 epoch 4 - iter 232/292 - loss 0.05797129 - time (sec): 10.53 - samples/sec: 3396.68 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:33:53,474 epoch 4 - iter 261/292 - loss 0.05843650 - time (sec): 11.96 - samples/sec: 3381.69 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:33:54,735 epoch 4 - iter 290/292 - loss 0.05750520 - time (sec): 13.22 - samples/sec: 3343.54 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:33:54,814 ----------------------------------------------------------------------------------------------------
2023-10-25 21:33:54,814 EPOCH 4 done: loss 0.0572 - lr: 0.000033
2023-10-25 21:33:55,722 DEV : loss 0.1722760796546936 - f1-score (micro avg)  0.7025
2023-10-25 21:33:55,726 ----------------------------------------------------------------------------------------------------
2023-10-25 21:33:56,988 epoch 5 - iter 29/292 - loss 0.04342491 - time (sec): 1.26 - samples/sec: 3606.54 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:33:58,245 epoch 5 - iter 58/292 - loss 0.04319897 - time (sec): 2.52 - samples/sec: 3469.22 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:33:59,571 epoch 5 - iter 87/292 - loss 0.03827778 - time (sec): 3.84 - samples/sec: 3389.04 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:34:00,840 epoch 5 - iter 116/292 - loss 0.03386059 - time (sec): 5.11 - samples/sec: 3404.01 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:34:02,129 epoch 5 - iter 145/292 - loss 0.03593760 - time (sec): 6.40 - samples/sec: 3414.83 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:34:03,401 epoch 5 - iter 174/292 - loss 0.03846183 - time (sec): 7.67 - samples/sec: 3360.64 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:34:04,755 epoch 5 - iter 203/292 - loss 0.03960139 - time (sec): 9.03 - samples/sec: 3365.46 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:34:06,006 epoch 5 - iter 232/292 - loss 0.03978996 - time (sec): 10.28 - samples/sec: 3462.51 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:34:07,231 epoch 5 - iter 261/292 - loss 0.04007028 - time (sec): 11.50 - samples/sec: 3481.69 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:34:08,469 epoch 5 - iter 290/292 - loss 0.03905369 - time (sec): 12.74 - samples/sec: 3476.05 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:34:08,549 ----------------------------------------------------------------------------------------------------
2023-10-25 21:34:08,550 EPOCH 5 done: loss 0.0390 - lr: 0.000028
2023-10-25 21:34:09,457 DEV : loss 0.16055038571357727 - f1-score (micro avg)  0.6835
2023-10-25 21:34:09,462 ----------------------------------------------------------------------------------------------------
2023-10-25 21:34:10,800 epoch 6 - iter 29/292 - loss 0.03141562 - time (sec): 1.34 - samples/sec: 3701.70 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:34:12,095 epoch 6 - iter 58/292 - loss 0.03968166 - time (sec): 2.63 - samples/sec: 3404.87 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:34:13,387 epoch 6 - iter 87/292 - loss 0.03196566 - time (sec): 3.92 - samples/sec: 3450.89 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:34:14,722 epoch 6 - iter 116/292 - loss 0.03582229 - time (sec): 5.26 - samples/sec: 3465.57 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:34:15,994 epoch 6 - iter 145/292 - loss 0.03410517 - time (sec): 6.53 - samples/sec: 3474.12 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:34:17,304 epoch 6 - iter 174/292 - loss 0.03262458 - time (sec): 7.84 - samples/sec: 3455.30 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:34:18,567 epoch 6 - iter 203/292 - loss 0.03056044 - time (sec): 9.10 - samples/sec: 3418.19 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:34:19,863 epoch 6 - iter 232/292 - loss 0.03001793 - time (sec): 10.40 - samples/sec: 3390.09 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:34:21,193 epoch 6 - iter 261/292 - loss 0.03043669 - time (sec): 11.73 - samples/sec: 3399.43 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:34:22,499 epoch 6 - iter 290/292 - loss 0.03080981 - time (sec): 13.04 - samples/sec: 3371.72 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:34:22,589 ----------------------------------------------------------------------------------------------------
2023-10-25 21:34:22,589 EPOCH 6 done: loss 0.0306 - lr: 0.000022
2023-10-25 21:34:23,501 DEV : loss 0.19378620386123657 - f1-score (micro avg)  0.7133
2023-10-25 21:34:23,506 ----------------------------------------------------------------------------------------------------
2023-10-25 21:34:24,816 epoch 7 - iter 29/292 - loss 0.02249619 - time (sec): 1.31 - samples/sec: 3713.89 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:34:26,141 epoch 7 - iter 58/292 - loss 0.03051413 - time (sec): 2.63 - samples/sec: 3714.66 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:34:27,389 epoch 7 - iter 87/292 - loss 0.03401934 - time (sec): 3.88 - samples/sec: 3562.14 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:34:28,671 epoch 7 - iter 116/292 - loss 0.03101955 - time (sec): 5.16 - samples/sec: 3443.23 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:34:29,951 epoch 7 - iter 145/292 - loss 0.02682969 - time (sec): 6.44 - samples/sec: 3368.08 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:34:31,346 epoch 7 - iter 174/292 - loss 0.02528320 - time (sec): 7.84 - samples/sec: 3392.08 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:34:32,684 epoch 7 - iter 203/292 - loss 0.02397947 - time (sec): 9.18 - samples/sec: 3398.50 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:34:33,987 epoch 7 - iter 232/292 - loss 0.02354003 - time (sec): 10.48 - samples/sec: 3367.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:34:35,297 epoch 7 - iter 261/292 - loss 0.02151238 - time (sec): 11.79 - samples/sec: 3361.83 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:34:36,578 epoch 7 - iter 290/292 - loss 0.02098365 - time (sec): 13.07 - samples/sec: 3389.04 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:34:36,653 ----------------------------------------------------------------------------------------------------
2023-10-25 21:34:36,654 EPOCH 7 done: loss 0.0209 - lr: 0.000017
2023-10-25 21:34:37,749 DEV : loss 0.1828424036502838 - f1-score (micro avg)  0.7832
2023-10-25 21:34:37,754 saving best model
2023-10-25 21:34:38,425 ----------------------------------------------------------------------------------------------------
2023-10-25 21:34:39,867 epoch 8 - iter 29/292 - loss 0.02744473 - time (sec): 1.44 - samples/sec: 3042.64 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:34:41,255 epoch 8 - iter 58/292 - loss 0.02406253 - time (sec): 2.83 - samples/sec: 3124.97 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:34:42,541 epoch 8 - iter 87/292 - loss 0.01768261 - time (sec): 4.11 - samples/sec: 3273.21 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:34:43,782 epoch 8 - iter 116/292 - loss 0.01689601 - time (sec): 5.35 - samples/sec: 3291.07 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:34:45,038 epoch 8 - iter 145/292 - loss 0.01534873 - time (sec): 6.61 - samples/sec: 3295.91 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:34:46,352 epoch 8 - iter 174/292 - loss 0.01693279 - time (sec): 7.92 - samples/sec: 3280.78 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:34:47,614 epoch 8 - iter 203/292 - loss 0.01577629 - time (sec): 9.19 - samples/sec: 3233.29 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:34:48,896 epoch 8 - iter 232/292 - loss 0.01567376 - time (sec): 10.47 - samples/sec: 3264.57 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:34:50,171 epoch 8 - iter 261/292 - loss 0.01480935 - time (sec): 11.74 - samples/sec: 3318.43 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:34:51,559 epoch 8 - iter 290/292 - loss 0.01450321 - time (sec): 13.13 - samples/sec: 3369.14 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:34:51,643 ----------------------------------------------------------------------------------------------------
2023-10-25 21:34:51,643 EPOCH 8 done: loss 0.0144 - lr: 0.000011
2023-10-25 21:34:52,565 DEV : loss 0.2024029940366745 - f1-score (micro avg)  0.7134
2023-10-25 21:34:52,569 ----------------------------------------------------------------------------------------------------
2023-10-25 21:34:53,951 epoch 9 - iter 29/292 - loss 0.00639943 - time (sec): 1.38 - samples/sec: 3617.05 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:34:55,179 epoch 9 - iter 58/292 - loss 0.00947452 - time (sec): 2.61 - samples/sec: 3552.62 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:34:56,462 epoch 9 - iter 87/292 - loss 0.00782213 - time (sec): 3.89 - samples/sec: 3553.60 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:34:57,797 epoch 9 - iter 116/292 - loss 0.01172703 - time (sec): 5.23 - samples/sec: 3543.78 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:34:59,111 epoch 9 - iter 145/292 - loss 0.01086021 - time (sec): 6.54 - samples/sec: 3507.32 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:35:00,406 epoch 9 - iter 174/292 - loss 0.01055746 - time (sec): 7.84 - samples/sec: 3482.07 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:35:01,686 epoch 9 - iter 203/292 - loss 0.00948365 - time (sec): 9.12 - samples/sec: 3480.90 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:35:02,926 epoch 9 - iter 232/292 - loss 0.00922094 - time (sec): 10.36 - samples/sec: 3441.15 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:35:04,220 epoch 9 - iter 261/292 - loss 0.00941786 - time (sec): 11.65 - samples/sec: 3399.49 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:35:05,542 epoch 9 - iter 290/292 - loss 0.00868448 - time (sec): 12.97 - samples/sec: 3404.17 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:35:05,627 ----------------------------------------------------------------------------------------------------
2023-10-25 21:35:05,627 EPOCH 9 done: loss 0.0086 - lr: 0.000006
2023-10-25 21:35:06,545 DEV : loss 0.21211808919906616 - f1-score (micro avg)  0.7403
2023-10-25 21:35:06,549 ----------------------------------------------------------------------------------------------------
2023-10-25 21:35:07,830 epoch 10 - iter 29/292 - loss 0.00154113 - time (sec): 1.28 - samples/sec: 3413.30 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:35:09,139 epoch 10 - iter 58/292 - loss 0.00087880 - time (sec): 2.59 - samples/sec: 3179.05 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:35:10,428 epoch 10 - iter 87/292 - loss 0.00742925 - time (sec): 3.88 - samples/sec: 3189.66 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:35:11,675 epoch 10 - iter 116/292 - loss 0.00728705 - time (sec): 5.12 - samples/sec: 3255.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:35:13,053 epoch 10 - iter 145/292 - loss 0.00611792 - time (sec): 6.50 - samples/sec: 3311.10 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:35:14,289 epoch 10 - iter 174/292 - loss 0.00635455 - time (sec): 7.74 - samples/sec: 3345.65 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:35:15,631 epoch 10 - iter 203/292 - loss 0.00623399 - time (sec): 9.08 - samples/sec: 3401.46 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:35:16,944 epoch 10 - iter 232/292 - loss 0.00631900 - time (sec): 10.39 - samples/sec: 3381.05 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:35:18,296 epoch 10 - iter 261/292 - loss 0.00618350 - time (sec): 11.75 - samples/sec: 3378.37 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:35:19,587 epoch 10 - iter 290/292 - loss 0.00631463 - time (sec): 13.04 - samples/sec: 3395.56 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:35:19,663 ----------------------------------------------------------------------------------------------------
2023-10-25 21:35:19,663 EPOCH 10 done: loss 0.0063 - lr: 0.000000
2023-10-25 21:35:20,570 DEV : loss 0.21458660066127777 - f1-score (micro avg)  0.7179
2023-10-25 21:35:21,093 ----------------------------------------------------------------------------------------------------
2023-10-25 21:35:21,094 Loading model from best epoch ...
2023-10-25 21:35:22,805 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:35:24,351
Results:
- F-score (micro) 0.7601
- F-score (macro) 0.6983
- Accuracy 0.6367

By class:
              precision    recall  f1-score   support

         PER     0.8000    0.8391    0.8191       348
         LOC     0.6709    0.8123    0.7348       261
         ORG     0.5102    0.4808    0.4950        52
   HumanProd     0.7619    0.7273    0.7442        22

   micro avg     0.7257    0.7980    0.7601       683
   macro avg     0.6857    0.7148    0.6983       683
weighted avg     0.7274    0.7980    0.7598       683

2023-10-25 21:35:24,351 ----------------------------------------------------------------------------------------------------
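For reference, the per-epoch iteration count and the learning-rate progression in the log follow directly from the logged training params. The sketch below (plain arithmetic, not Flair code) assumes a standard linear warmup/decay schedule with `warmup_fraction: 0.1`; Flair's actual `LinearScheduler` may differ slightly in its step bookkeeping, so the logged lr values match only approximately.

```python
import math

# Values taken from the training log above.
train_sentences = 1166
mini_batch_size = 4
max_epochs = 10
peak_lr = 5e-05
warmup_fraction = 0.1

# 1166 sentences at batch size 4 -> 292 iterations per epoch,
# matching the "iter .../292" progress lines.
iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
print(iters_per_epoch)  # 292

# Linear warmup over the first 10% of all steps (here: epoch 1),
# then linear decay to 0 by the final step (lr: 0.000000 at epoch 10).
total_steps = iters_per_epoch * max_epochs
warmup_steps = int(total_steps * warmup_fraction)

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps (assumed schedule)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

With these assumptions the peak lr of 5e-05 is reached at the end of epoch 1 and the schedule hits exactly 0 at the last step, consistent with the lr column ramping up through epoch 1 and decaying to 0.000000 by epoch 10.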