2023-10-19 02:07:58,307 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,308 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=81, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 02:07:58,308 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,309 Corpus: 6900 train + 1576 dev + 1833 test sentences
2023-10-19 02:07:58,309 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,309 Train:  6900 sentences
2023-10-19 02:07:58,309         (train_with_dev=False, train_with_test=False)
2023-10-19 02:07:58,309 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,309 Training Params:
2023-10-19 02:07:58,309  - learning_rate: "3e-05" 
2023-10-19 02:07:58,309  - mini_batch_size: "16"
2023-10-19 02:07:58,309  - max_epochs: "10"
2023-10-19 02:07:58,309  - shuffle: "True"
2023-10-19 02:07:58,309 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,309 Plugins:
2023-10-19 02:07:58,309  - TensorboardLogger
2023-10-19 02:07:58,309  - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 02:07:58,309 ----------------------------------------------------------------------------------------------------
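(A note on the schedule above: with mini_batch_size 16 the log runs 432 iterations per epoch, so 10 epochs give 4320 optimizer steps, and warmup_fraction 0.1 means the learning rate ramps linearly to 3e-05 over exactly the first epoch, then decays linearly to zero. A minimal plain-Python sketch, assuming LinearScheduler behaves as a linear warmup followed by a linear decay, reproduces the lr values printed in the log:)

```python
# Sketch of the assumed LinearScheduler behaviour: linear warmup to the peak
# learning rate over the first warmup_fraction of all steps, then linear
# decay to zero. Constants are taken from the Training Params section above.

PEAK_LR = 3e-05          # learning_rate
STEPS_PER_EPOCH = 432    # iterations per epoch in the log
MAX_EPOCHS = 10
WARMUP_FRACTION = 0.1    # LinearScheduler | warmup_fraction: '0.1'

TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS          # 4320
WARMUP_STEPS = int(TOTAL_STEPS * WARMUP_FRACTION)   # 432, i.e. all of epoch 1

def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps (1-based)."""
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(linear_schedule_lr(43))    # epoch 1, iter 43 -> "lr: 0.000003" in the log
print(linear_schedule_lr(518))   # epoch 2, iter 86 -> "lr: 0.000029" in the log
print(linear_schedule_lr(4320))  # last step of epoch 10 -> "lr: 0.000000"
```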
2023-10-19 02:07:58,309 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 02:07:58,309  - metric: "('micro avg', 'f1-score')"
2023-10-19 02:07:58,309 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,309 Computation:
2023-10-19 02:07:58,310  - compute on device: cuda:0
2023-10-19 02:07:58,310  - embedding storage: none
2023-10-19 02:07:58,310 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,310 Model training base path: "autotrain-flair-mobie-gbert_base-bs16-e10-lr3e-05-4"
2023-10-19 02:07:58,310 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,310 ----------------------------------------------------------------------------------------------------
2023-10-19 02:07:58,310 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 02:08:13,517 epoch 1 - iter 43/432 - loss 4.93250462 - time (sec): 15.21 - samples/sec: 429.82 - lr: 0.000003 - momentum: 0.000000
2023-10-19 02:08:28,839 epoch 1 - iter 86/432 - loss 3.97734066 - time (sec): 30.53 - samples/sec: 411.99 - lr: 0.000006 - momentum: 0.000000
2023-10-19 02:08:43,873 epoch 1 - iter 129/432 - loss 3.26708172 - time (sec): 45.56 - samples/sec: 408.49 - lr: 0.000009 - momentum: 0.000000
2023-10-19 02:08:59,143 epoch 1 - iter 172/432 - loss 2.86543349 - time (sec): 60.83 - samples/sec: 410.11 - lr: 0.000012 - momentum: 0.000000
2023-10-19 02:09:15,576 epoch 1 - iter 215/432 - loss 2.59028422 - time (sec): 77.26 - samples/sec: 401.88 - lr: 0.000015 - momentum: 0.000000
2023-10-19 02:09:30,082 epoch 1 - iter 258/432 - loss 2.35703794 - time (sec): 91.77 - samples/sec: 406.04 - lr: 0.000018 - momentum: 0.000000
2023-10-19 02:09:44,936 epoch 1 - iter 301/432 - loss 2.15806965 - time (sec): 106.63 - samples/sec: 406.83 - lr: 0.000021 - momentum: 0.000000
2023-10-19 02:09:59,763 epoch 1 - iter 344/432 - loss 2.00613180 - time (sec): 121.45 - samples/sec: 408.73 - lr: 0.000024 - momentum: 0.000000
2023-10-19 02:10:13,958 epoch 1 - iter 387/432 - loss 1.88001586 - time (sec): 135.65 - samples/sec: 409.08 - lr: 0.000027 - momentum: 0.000000
2023-10-19 02:10:29,196 epoch 1 - iter 430/432 - loss 1.76470516 - time (sec): 150.89 - samples/sec: 408.67 - lr: 0.000030 - momentum: 0.000000
2023-10-19 02:10:29,817 ----------------------------------------------------------------------------------------------------
2023-10-19 02:10:29,817 EPOCH 1 done: loss 1.7614 - lr: 0.000030
2023-10-19 02:10:43,326 DEV : loss 0.5575976967811584 - f1-score (micro avg)  0.6297
2023-10-19 02:10:43,351 saving best model
2023-10-19 02:10:43,792 ----------------------------------------------------------------------------------------------------
2023-10-19 02:10:58,128 epoch 2 - iter 43/432 - loss 0.62560817 - time (sec): 14.33 - samples/sec: 414.75 - lr: 0.000030 - momentum: 0.000000
2023-10-19 02:11:12,293 epoch 2 - iter 86/432 - loss 0.61003222 - time (sec): 28.50 - samples/sec: 441.84 - lr: 0.000029 - momentum: 0.000000
2023-10-19 02:11:27,392 epoch 2 - iter 129/432 - loss 0.58076243 - time (sec): 43.60 - samples/sec: 420.07 - lr: 0.000029 - momentum: 0.000000
2023-10-19 02:11:41,758 epoch 2 - iter 172/432 - loss 0.56106454 - time (sec): 57.96 - samples/sec: 421.61 - lr: 0.000029 - momentum: 0.000000
2023-10-19 02:11:56,333 epoch 2 - iter 215/432 - loss 0.54986490 - time (sec): 72.54 - samples/sec: 419.66 - lr: 0.000028 - momentum: 0.000000
2023-10-19 02:12:11,136 epoch 2 - iter 258/432 - loss 0.53748417 - time (sec): 87.34 - samples/sec: 421.39 - lr: 0.000028 - momentum: 0.000000
2023-10-19 02:12:26,691 epoch 2 - iter 301/432 - loss 0.52065632 - time (sec): 102.90 - samples/sec: 417.10 - lr: 0.000028 - momentum: 0.000000
2023-10-19 02:12:42,153 epoch 2 - iter 344/432 - loss 0.50823824 - time (sec): 118.36 - samples/sec: 411.50 - lr: 0.000027 - momentum: 0.000000
2023-10-19 02:12:58,398 epoch 2 - iter 387/432 - loss 0.49509518 - time (sec): 134.60 - samples/sec: 409.60 - lr: 0.000027 - momentum: 0.000000
2023-10-19 02:13:13,413 epoch 2 - iter 430/432 - loss 0.48286405 - time (sec): 149.62 - samples/sec: 412.11 - lr: 0.000027 - momentum: 0.000000
2023-10-19 02:13:13,985 ----------------------------------------------------------------------------------------------------
2023-10-19 02:13:13,986 EPOCH 2 done: loss 0.4828 - lr: 0.000027
2023-10-19 02:13:27,322 DEV : loss 0.3526449203491211 - f1-score (micro avg)  0.7754
2023-10-19 02:13:27,346 saving best model
2023-10-19 02:13:28,590 ----------------------------------------------------------------------------------------------------
2023-10-19 02:13:43,341 epoch 3 - iter 43/432 - loss 0.31230210 - time (sec): 14.75 - samples/sec: 421.77 - lr: 0.000026 - momentum: 0.000000
2023-10-19 02:13:57,500 epoch 3 - iter 86/432 - loss 0.30333100 - time (sec): 28.91 - samples/sec: 427.74 - lr: 0.000026 - momentum: 0.000000
2023-10-19 02:14:12,357 epoch 3 - iter 129/432 - loss 0.29986924 - time (sec): 43.76 - samples/sec: 420.79 - lr: 0.000026 - momentum: 0.000000
2023-10-19 02:14:27,258 epoch 3 - iter 172/432 - loss 0.30292860 - time (sec): 58.67 - samples/sec: 422.53 - lr: 0.000025 - momentum: 0.000000
2023-10-19 02:14:42,390 epoch 3 - iter 215/432 - loss 0.30311275 - time (sec): 73.80 - samples/sec: 417.71 - lr: 0.000025 - momentum: 0.000000
2023-10-19 02:14:57,121 epoch 3 - iter 258/432 - loss 0.30056148 - time (sec): 88.53 - samples/sec: 417.74 - lr: 0.000025 - momentum: 0.000000
2023-10-19 02:15:12,777 epoch 3 - iter 301/432 - loss 0.30074375 - time (sec): 104.19 - samples/sec: 416.07 - lr: 0.000024 - momentum: 0.000000
2023-10-19 02:15:27,941 epoch 3 - iter 344/432 - loss 0.30006914 - time (sec): 119.35 - samples/sec: 414.44 - lr: 0.000024 - momentum: 0.000000
2023-10-19 02:15:43,189 epoch 3 - iter 387/432 - loss 0.29793746 - time (sec): 134.60 - samples/sec: 414.38 - lr: 0.000024 - momentum: 0.000000
2023-10-19 02:15:57,165 epoch 3 - iter 430/432 - loss 0.29556403 - time (sec): 148.57 - samples/sec: 414.85 - lr: 0.000023 - momentum: 0.000000
2023-10-19 02:15:57,703 ----------------------------------------------------------------------------------------------------
2023-10-19 02:15:57,703 EPOCH 3 done: loss 0.2954 - lr: 0.000023
2023-10-19 02:16:11,087 DEV : loss 0.30149412155151367 - f1-score (micro avg)  0.8069
2023-10-19 02:16:11,111 saving best model
2023-10-19 02:16:12,352 ----------------------------------------------------------------------------------------------------
2023-10-19 02:16:27,084 epoch 4 - iter 43/432 - loss 0.21264161 - time (sec): 14.73 - samples/sec: 411.11 - lr: 0.000023 - momentum: 0.000000
2023-10-19 02:16:43,109 epoch 4 - iter 86/432 - loss 0.22163872 - time (sec): 30.76 - samples/sec: 394.07 - lr: 0.000023 - momentum: 0.000000
2023-10-19 02:16:58,411 epoch 4 - iter 129/432 - loss 0.22101676 - time (sec): 46.06 - samples/sec: 396.91 - lr: 0.000022 - momentum: 0.000000
2023-10-19 02:17:13,941 epoch 4 - iter 172/432 - loss 0.22361025 - time (sec): 61.59 - samples/sec: 395.37 - lr: 0.000022 - momentum: 0.000000
2023-10-19 02:17:27,914 epoch 4 - iter 215/432 - loss 0.22111072 - time (sec): 75.56 - samples/sec: 402.30 - lr: 0.000022 - momentum: 0.000000
2023-10-19 02:17:43,302 epoch 4 - iter 258/432 - loss 0.21935857 - time (sec): 90.95 - samples/sec: 397.99 - lr: 0.000021 - momentum: 0.000000
2023-10-19 02:17:57,934 epoch 4 - iter 301/432 - loss 0.21595980 - time (sec): 105.58 - samples/sec: 403.99 - lr: 0.000021 - momentum: 0.000000
2023-10-19 02:18:13,452 epoch 4 - iter 344/432 - loss 0.21581270 - time (sec): 121.10 - samples/sec: 408.20 - lr: 0.000021 - momentum: 0.000000
2023-10-19 02:18:28,710 epoch 4 - iter 387/432 - loss 0.21528790 - time (sec): 136.36 - samples/sec: 406.51 - lr: 0.000020 - momentum: 0.000000
2023-10-19 02:18:43,163 epoch 4 - iter 430/432 - loss 0.21420583 - time (sec): 150.81 - samples/sec: 408.75 - lr: 0.000020 - momentum: 0.000000
2023-10-19 02:18:43,749 ----------------------------------------------------------------------------------------------------
2023-10-19 02:18:43,749 EPOCH 4 done: loss 0.2144 - lr: 0.000020
2023-10-19 02:18:57,091 DEV : loss 0.3102978467941284 - f1-score (micro avg)  0.8163
2023-10-19 02:18:57,116 saving best model
2023-10-19 02:18:58,362 ----------------------------------------------------------------------------------------------------
2023-10-19 02:19:12,811 epoch 5 - iter 43/432 - loss 0.15083744 - time (sec): 14.45 - samples/sec: 412.38 - lr: 0.000020 - momentum: 0.000000
2023-10-19 02:19:27,463 epoch 5 - iter 86/432 - loss 0.15320825 - time (sec): 29.10 - samples/sec: 418.60 - lr: 0.000019 - momentum: 0.000000
2023-10-19 02:19:42,164 epoch 5 - iter 129/432 - loss 0.15857775 - time (sec): 43.80 - samples/sec: 428.40 - lr: 0.000019 - momentum: 0.000000
2023-10-19 02:19:57,171 epoch 5 - iter 172/432 - loss 0.15560054 - time (sec): 58.81 - samples/sec: 426.51 - lr: 0.000019 - momentum: 0.000000
2023-10-19 02:20:12,653 epoch 5 - iter 215/432 - loss 0.15299635 - time (sec): 74.29 - samples/sec: 412.79 - lr: 0.000018 - momentum: 0.000000
2023-10-19 02:20:26,815 epoch 5 - iter 258/432 - loss 0.15415565 - time (sec): 88.45 - samples/sec: 413.63 - lr: 0.000018 - momentum: 0.000000
2023-10-19 02:20:41,252 epoch 5 - iter 301/432 - loss 0.15470623 - time (sec): 102.89 - samples/sec: 415.99 - lr: 0.000018 - momentum: 0.000000
2023-10-19 02:20:57,180 epoch 5 - iter 344/432 - loss 0.15738806 - time (sec): 118.82 - samples/sec: 413.47 - lr: 0.000017 - momentum: 0.000000
2023-10-19 02:21:12,778 epoch 5 - iter 387/432 - loss 0.15861707 - time (sec): 134.41 - samples/sec: 411.86 - lr: 0.000017 - momentum: 0.000000
2023-10-19 02:21:28,825 epoch 5 - iter 430/432 - loss 0.15871889 - time (sec): 150.46 - samples/sec: 409.68 - lr: 0.000017 - momentum: 0.000000
2023-10-19 02:21:29,386 ----------------------------------------------------------------------------------------------------
2023-10-19 02:21:29,387 EPOCH 5 done: loss 0.1591 - lr: 0.000017
2023-10-19 02:21:42,711 DEV : loss 0.3180293142795563 - f1-score (micro avg)  0.8294
2023-10-19 02:21:42,736 saving best model
2023-10-19 02:21:43,978 ----------------------------------------------------------------------------------------------------
2023-10-19 02:21:58,649 epoch 6 - iter 43/432 - loss 0.11722725 - time (sec): 14.67 - samples/sec: 429.78 - lr: 0.000016 - momentum: 0.000000
2023-10-19 02:22:13,269 epoch 6 - iter 86/432 - loss 0.11922248 - time (sec): 29.29 - samples/sec: 428.61 - lr: 0.000016 - momentum: 0.000000
2023-10-19 02:22:29,215 epoch 6 - iter 129/432 - loss 0.11571691 - time (sec): 45.24 - samples/sec: 417.24 - lr: 0.000016 - momentum: 0.000000
2023-10-19 02:22:44,481 epoch 6 - iter 172/432 - loss 0.11512543 - time (sec): 60.50 - samples/sec: 415.94 - lr: 0.000015 - momentum: 0.000000
2023-10-19 02:22:59,365 epoch 6 - iter 215/432 - loss 0.11922747 - time (sec): 75.39 - samples/sec: 415.99 - lr: 0.000015 - momentum: 0.000000
2023-10-19 02:23:13,706 epoch 6 - iter 258/432 - loss 0.12213480 - time (sec): 89.73 - samples/sec: 412.85 - lr: 0.000015 - momentum: 0.000000
2023-10-19 02:23:28,113 epoch 6 - iter 301/432 - loss 0.12324684 - time (sec): 104.13 - samples/sec: 413.96 - lr: 0.000014 - momentum: 0.000000
2023-10-19 02:23:42,639 epoch 6 - iter 344/432 - loss 0.12430268 - time (sec): 118.66 - samples/sec: 415.94 - lr: 0.000014 - momentum: 0.000000
2023-10-19 02:23:57,051 epoch 6 - iter 387/432 - loss 0.12584428 - time (sec): 133.07 - samples/sec: 416.41 - lr: 0.000014 - momentum: 0.000000
2023-10-19 02:24:11,447 epoch 6 - iter 430/432 - loss 0.12693815 - time (sec): 147.47 - samples/sec: 418.17 - lr: 0.000013 - momentum: 0.000000
2023-10-19 02:24:12,214 ----------------------------------------------------------------------------------------------------
2023-10-19 02:24:12,215 EPOCH 6 done: loss 0.1269 - lr: 0.000013
2023-10-19 02:24:25,523 DEV : loss 0.32838910818099976 - f1-score (micro avg)  0.8206
2023-10-19 02:24:25,547 ----------------------------------------------------------------------------------------------------
2023-10-19 02:24:39,692 epoch 7 - iter 43/432 - loss 0.08912504 - time (sec): 14.14 - samples/sec: 441.62 - lr: 0.000013 - momentum: 0.000000
2023-10-19 02:24:54,333 epoch 7 - iter 86/432 - loss 0.09693013 - time (sec): 28.78 - samples/sec: 423.29 - lr: 0.000013 - momentum: 0.000000
2023-10-19 02:25:09,997 epoch 7 - iter 129/432 - loss 0.09674099 - time (sec): 44.45 - samples/sec: 416.87 - lr: 0.000012 - momentum: 0.000000
2023-10-19 02:25:24,036 epoch 7 - iter 172/432 - loss 0.09757105 - time (sec): 58.49 - samples/sec: 418.36 - lr: 0.000012 - momentum: 0.000000
2023-10-19 02:25:38,207 epoch 7 - iter 215/432 - loss 0.09807207 - time (sec): 72.66 - samples/sec: 416.52 - lr: 0.000012 - momentum: 0.000000
2023-10-19 02:25:53,354 epoch 7 - iter 258/432 - loss 0.09815033 - time (sec): 87.81 - samples/sec: 414.87 - lr: 0.000011 - momentum: 0.000000
2023-10-19 02:26:08,690 epoch 7 - iter 301/432 - loss 0.09658140 - time (sec): 103.14 - samples/sec: 415.61 - lr: 0.000011 - momentum: 0.000000
2023-10-19 02:26:23,350 epoch 7 - iter 344/432 - loss 0.09822029 - time (sec): 117.80 - samples/sec: 414.25 - lr: 0.000011 - momentum: 0.000000
2023-10-19 02:26:38,076 epoch 7 - iter 387/432 - loss 0.09934983 - time (sec): 132.53 - samples/sec: 417.40 - lr: 0.000010 - momentum: 0.000000
2023-10-19 02:26:53,511 epoch 7 - iter 430/432 - loss 0.10039982 - time (sec): 147.96 - samples/sec: 416.66 - lr: 0.000010 - momentum: 0.000000
2023-10-19 02:26:53,991 ----------------------------------------------------------------------------------------------------
2023-10-19 02:26:53,991 EPOCH 7 done: loss 0.1008 - lr: 0.000010
2023-10-19 02:27:07,624 DEV : loss 0.34814995527267456 - f1-score (micro avg)  0.832
2023-10-19 02:27:07,647 saving best model
2023-10-19 02:27:08,914 ----------------------------------------------------------------------------------------------------
2023-10-19 02:27:22,666 epoch 8 - iter 43/432 - loss 0.10350894 - time (sec): 13.75 - samples/sec: 470.25 - lr: 0.000010 - momentum: 0.000000
2023-10-19 02:27:36,004 epoch 8 - iter 86/432 - loss 0.09852409 - time (sec): 27.09 - samples/sec: 476.91 - lr: 0.000009 - momentum: 0.000000
2023-10-19 02:27:50,049 epoch 8 - iter 129/432 - loss 0.09286219 - time (sec): 41.13 - samples/sec: 465.85 - lr: 0.000009 - momentum: 0.000000
2023-10-19 02:28:03,617 epoch 8 - iter 172/432 - loss 0.08743720 - time (sec): 54.70 - samples/sec: 455.60 - lr: 0.000009 - momentum: 0.000000
2023-10-19 02:28:17,277 epoch 8 - iter 215/432 - loss 0.08573511 - time (sec): 68.36 - samples/sec: 459.17 - lr: 0.000008 - momentum: 0.000000
2023-10-19 02:28:30,969 epoch 8 - iter 258/432 - loss 0.08331380 - time (sec): 82.05 - samples/sec: 462.63 - lr: 0.000008 - momentum: 0.000000
2023-10-19 02:28:44,354 epoch 8 - iter 301/432 - loss 0.08249316 - time (sec): 95.44 - samples/sec: 457.26 - lr: 0.000008 - momentum: 0.000000
2023-10-19 02:28:58,899 epoch 8 - iter 344/432 - loss 0.08196643 - time (sec): 109.98 - samples/sec: 448.87 - lr: 0.000007 - momentum: 0.000000
2023-10-19 02:29:12,883 epoch 8 - iter 387/432 - loss 0.08297009 - time (sec): 123.97 - samples/sec: 447.78 - lr: 0.000007 - momentum: 0.000000
2023-10-19 02:29:26,795 epoch 8 - iter 430/432 - loss 0.08266864 - time (sec): 137.88 - samples/sec: 447.49 - lr: 0.000007 - momentum: 0.000000
2023-10-19 02:29:27,277 ----------------------------------------------------------------------------------------------------
2023-10-19 02:29:27,277 EPOCH 8 done: loss 0.0826 - lr: 0.000007
2023-10-19 02:29:39,277 DEV : loss 0.34838762879371643 - f1-score (micro avg)  0.8366
2023-10-19 02:29:39,301 saving best model
2023-10-19 02:29:40,574 ----------------------------------------------------------------------------------------------------
2023-10-19 02:29:53,319 epoch 9 - iter 43/432 - loss 0.06186272 - time (sec): 12.74 - samples/sec: 474.43 - lr: 0.000006 - momentum: 0.000000
2023-10-19 02:30:08,597 epoch 9 - iter 86/432 - loss 0.06634905 - time (sec): 28.02 - samples/sec: 422.46 - lr: 0.000006 - momentum: 0.000000
2023-10-19 02:30:22,346 epoch 9 - iter 129/432 - loss 0.07167375 - time (sec): 41.77 - samples/sec: 424.58 - lr: 0.000006 - momentum: 0.000000
2023-10-19 02:30:36,275 epoch 9 - iter 172/432 - loss 0.06969364 - time (sec): 55.70 - samples/sec: 426.04 - lr: 0.000005 - momentum: 0.000000
2023-10-19 02:30:50,140 epoch 9 - iter 215/432 - loss 0.06782716 - time (sec): 69.56 - samples/sec: 430.13 - lr: 0.000005 - momentum: 0.000000
2023-10-19 02:31:04,623 epoch 9 - iter 258/432 - loss 0.06830674 - time (sec): 84.05 - samples/sec: 428.69 - lr: 0.000005 - momentum: 0.000000
2023-10-19 02:31:18,536 epoch 9 - iter 301/432 - loss 0.06808899 - time (sec): 97.96 - samples/sec: 431.85 - lr: 0.000004 - momentum: 0.000000
2023-10-19 02:31:31,833 epoch 9 - iter 344/432 - loss 0.06544761 - time (sec): 111.26 - samples/sec: 438.17 - lr: 0.000004 - momentum: 0.000000
2023-10-19 02:31:45,008 epoch 9 - iter 387/432 - loss 0.06628105 - time (sec): 124.43 - samples/sec: 443.94 - lr: 0.000004 - momentum: 0.000000
2023-10-19 02:31:58,464 epoch 9 - iter 430/432 - loss 0.06698673 - time (sec): 137.89 - samples/sec: 446.90 - lr: 0.000003 - momentum: 0.000000
2023-10-19 02:31:58,887 ----------------------------------------------------------------------------------------------------
2023-10-19 02:31:58,887 EPOCH 9 done: loss 0.0669 - lr: 0.000003
2023-10-19 02:32:10,770 DEV : loss 0.37735414505004883 - f1-score (micro avg)  0.8355
2023-10-19 02:32:10,795 ----------------------------------------------------------------------------------------------------
2023-10-19 02:32:24,493 epoch 10 - iter 43/432 - loss 0.06688660 - time (sec): 13.70 - samples/sec: 479.67 - lr: 0.000003 - momentum: 0.000000
2023-10-19 02:32:38,648 epoch 10 - iter 86/432 - loss 0.06022775 - time (sec): 27.85 - samples/sec: 445.00 - lr: 0.000003 - momentum: 0.000000
2023-10-19 02:32:52,077 epoch 10 - iter 129/432 - loss 0.05771265 - time (sec): 41.28 - samples/sec: 452.85 - lr: 0.000002 - momentum: 0.000000
2023-10-19 02:33:05,697 epoch 10 - iter 172/432 - loss 0.05574235 - time (sec): 54.90 - samples/sec: 453.57 - lr: 0.000002 - momentum: 0.000000
2023-10-19 02:33:19,731 epoch 10 - iter 215/432 - loss 0.05793529 - time (sec): 68.93 - samples/sec: 450.88 - lr: 0.000002 - momentum: 0.000000
2023-10-19 02:33:32,567 epoch 10 - iter 258/432 - loss 0.05732065 - time (sec): 81.77 - samples/sec: 451.59 - lr: 0.000001 - momentum: 0.000000
2023-10-19 02:33:45,896 epoch 10 - iter 301/432 - loss 0.05663405 - time (sec): 95.10 - samples/sec: 448.98 - lr: 0.000001 - momentum: 0.000000
2023-10-19 02:33:59,944 epoch 10 - iter 344/432 - loss 0.05723962 - time (sec): 109.15 - samples/sec: 448.98 - lr: 0.000001 - momentum: 0.000000
2023-10-19 02:34:13,929 epoch 10 - iter 387/432 - loss 0.05878775 - time (sec): 123.13 - samples/sec: 447.55 - lr: 0.000000 - momentum: 0.000000
2023-10-19 02:34:27,981 epoch 10 - iter 430/432 - loss 0.05901642 - time (sec): 137.18 - samples/sec: 449.90 - lr: 0.000000 - momentum: 0.000000
2023-10-19 02:34:28,415 ----------------------------------------------------------------------------------------------------
2023-10-19 02:34:28,415 EPOCH 10 done: loss 0.0589 - lr: 0.000000
2023-10-19 02:34:40,654 DEV : loss 0.38429826498031616 - f1-score (micro avg)  0.8381
2023-10-19 02:34:40,679 saving best model
2023-10-19 02:34:42,621 ----------------------------------------------------------------------------------------------------
2023-10-19 02:34:42,623 Loading model from best epoch ...
2023-10-19 02:34:44,813 SequenceTagger predicts: Dictionary with 81 tags: O, S-location-route, B-location-route, E-location-route, I-location-route, S-location-stop, B-location-stop, E-location-stop, I-location-stop, S-trigger, B-trigger, E-trigger, I-trigger, S-organization-company, B-organization-company, E-organization-company, I-organization-company, S-location-city, B-location-city, E-location-city, I-location-city, S-location, B-location, E-location, I-location, S-event-cause, B-event-cause, E-event-cause, I-event-cause, S-location-street, B-location-street, E-location-street, I-location-street, S-time, B-time, E-time, I-time, S-date, B-date, E-date, I-date, S-number, B-number, E-number, I-number, S-duration, B-duration, E-duration, I-duration, S-organization
2023-10-19 02:35:01,412 
Results:
- F-score (micro) 0.7634
- F-score (macro) 0.5759
- Accuracy 0.6618

By class:
                      precision    recall  f1-score   support

             trigger     0.7056    0.6158    0.6577       833
       location-stop     0.8486    0.8353    0.8419       765
            location     0.7905    0.8286    0.8091       665
       location-city     0.8088    0.8746    0.8404       566
                date     0.8836    0.8477    0.8653       394
     location-street     0.9315    0.8808    0.9055       386
                time     0.7889    0.8906    0.8367       256
      location-route     0.7976    0.6937    0.7420       284
organization-company     0.7946    0.7063    0.7479       252
            distance     0.9940    1.0000    0.9970       167
              number     0.6721    0.8255    0.7410       149
            duration     0.3455    0.3497    0.3476       163
         event-cause     0.0000    0.0000    0.0000         0
       disaster-type     0.9375    0.4348    0.5941        69
        organization     0.4706    0.5714    0.5161        28
              person     0.3636    0.8000    0.5000        10
                 set     0.0000    0.0000    0.0000         0
        org-position     0.0000    0.0000    0.0000         1
               money     0.0000    0.0000    0.0000         0

           micro avg     0.7503    0.7771    0.7634      4988
           macro avg     0.5860    0.5871    0.5759      4988
        weighted avg     0.7941    0.7771    0.7826      4988

2023-10-19 02:35:01,413 ----------------------------------------------------------------------------------------------------
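(For reference, the "F-score (micro)" in the summary is the harmonic mean of the micro-averaged precision and recall from the last row of the report. A quick check in plain Python, using the printed "micro avg" values:)

```python
# F1 is the harmonic mean of precision and recall. Values are the rounded
# "micro avg" precision/recall printed in the classification report above.
precision = 0.7503
recall = 0.7771

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # ~0.7635, agreeing with the reported 0.7634 up to
                     # rounding of the printed precision/recall inputs
```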