lombardata committed on
Commit
b973fd7
1 Parent(s): 22a2cf4

🍻 cheers

Files changed (6)
  1. README.md +10 -6
  2. all_results.json +16 -0
  3. config.json +1 -1
  4. eval_results.json +12 -0
  5. train_results.json +8 -0
  6. trainer_state.json +1167 -0
README.md CHANGED
@@ -1,7 +1,11 @@
  ---
+ language:
+ - eng
  license: apache-2.0
  base_model: facebook/dinov2-large
  tags:
+ - multilabel-image-classification
+ - multilabel
  - generated_from_trainer
  metrics:
  - accuracy
@@ -15,13 +19,13 @@ should probably proofread and complete it, then remove this comment. -->

  # dinov2-large-2024_01_05-kornia_img-size518_batch-size32_epochs70_freeze

- This model is a fine-tuned version of [facebook/dinov2-large](https://huggingface.co/facebook/dinov2-large) on the None dataset.
+ This model is a fine-tuned version of [facebook/dinov2-large](https://huggingface.co/facebook/dinov2-large) on the multilabel_complete_dataset dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.0831
- - F1 Micro: 0.8552
- - F1 Macro: 0.7487
- - Roc Auc: 0.9104
- - Accuracy: 0.5562
+ - Loss: 0.0840
+ - F1 Micro: 0.8543
+ - F1 Macro: 0.7343
+ - Roc Auc: 0.9077
+ - Accuracy: 0.5606
  - Learning Rate: 0.0001

  ## Model description
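The README metrics show F1 Micro well above F1 Macro, and both well above Accuracy. The commit does not include the metric code, so the following is only a minimal pure-Python sketch of how these three quantities can diverge on multilabel data. It assumes that "Accuracy" means exact-match (subset) accuracy, which is how scikit-learn's `accuracy_score` treats multilabel input. The toy labels and the helper names `subset_accuracy` and `f1_micro_macro` are hypothetical.

```python
from typing import List, Tuple

Labels = List[List[int]]  # one 0/1 indicator vector per sample


def subset_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of samples whose whole label vector is predicted exactly."""
    exact = sum(t == p for t, p in zip(y_true, y_pred))
    return exact / len(y_true)


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_micro_macro(y_true: Labels, y_pred: Labels) -> Tuple[float, float]:
    """Return (micro F1, macro F1) for binary indicator matrices."""
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    # Micro: pool the counts over all labels, then compute a single F1.
    micro = _f1(
        sum(c[0] for c in counts),
        sum(c[1] for c in counts),
        sum(c[2] for c in counts),
    )
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(_f1(*c) for c in counts) / n_labels
    return micro, macro


# Toy data (hypothetical): label 0 is common, labels 1 and 2 are rare.
y_true = [[1, 1, 0], [1, 0, 0], [1, 0, 1], [1, 1, 0]]
y_pred = [[1, 1, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]

micro, macro = f1_micro_macro(y_true, y_pred)
acc = subset_accuracy(y_true, y_pred)
print(f"F1 micro={micro:.4f}  F1 macro={macro:.4f}  subset accuracy={acc:.4f}")
```

In this toy case the missed rare labels pull macro F1 and exact-match accuracy down while micro F1 stays high, the same ordering as in the reported numbers. Whether rare classes are the cause here cannot be determined from the diff alone.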
all_results.json ADDED
@@ -0,0 +1,16 @@
+ {
+ "epoch": 70.0,
+ "eval_accuracy": 0.5605742296918768,
+ "eval_f1_macro": 0.7342630546801885,
+ "eval_f1_micro": 0.8543162417321499,
+ "eval_loss": 0.08401281386613846,
+ "eval_roc_auc": 0.9076857807628663,
+ "eval_runtime": 670.4543,
+ "eval_samples_per_second": 4.26,
+ "eval_steps_per_second": 0.134,
+ "learning_rate": 0.0001,
+ "train_loss": 0.11672632308896316,
+ "train_runtime": 200748.2354,
+ "train_samples_per_second": 3.057,
+ "train_steps_per_second": 0.096
+ }
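The runtime and throughput fields in all_results.json can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not part of the commit. It takes `global_step` (19180) from trainer_state.json in the same commit and assumes a batch size of 32, inferred from `batch-size32` in the model name.

```python
import math

# Values copied from all_results.json / trainer_state.json in this commit.
train_runtime_s = 200748.2354
eval_runtime_s = 670.4543
eval_samples_per_second = 4.26
global_step = 19180
epochs = 70
batch_size = 32  # assumption: inferred from "batch-size32" in the model name

# Training side: optimizer steps per epoch and per second, wall-clock hours.
steps_per_epoch = global_step / epochs
train_steps_per_second = global_step / train_runtime_s
train_hours = train_runtime_s / 3600

# Evaluation side: approximate sample count and number of batches.
eval_samples = round(eval_samples_per_second * eval_runtime_s)
eval_steps = math.ceil(eval_samples / batch_size)
eval_steps_per_second = eval_steps / eval_runtime_s

print(f"steps per epoch:        {steps_per_epoch:.0f}")
print(f"train steps per second: {train_steps_per_second:.3f}")
print(f"training time (hours):  {train_hours:.1f}")
print(f"eval samples (approx.): {eval_samples}")
print(f"eval steps per second:  {eval_steps_per_second:.3f}")
```

Under these assumptions the derived rates round to the logged `train_steps_per_second` (0.096) and `eval_steps_per_second` (0.134), which suggests the fields are internally consistent.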
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "facebook/dinov2-large",
+ "_name_or_path": "facebook/dinov2-large2024_01_05",
  "apply_layernorm": true,
  "architectures": [
  "NewheadDinov2ForImageClassification"
eval_results.json ADDED
@@ -0,0 +1,12 @@
+ {
+ "epoch": 70.0,
+ "eval_accuracy": 0.5605742296918768,
+ "eval_f1_macro": 0.7342630546801885,
+ "eval_f1_micro": 0.8543162417321499,
+ "eval_loss": 0.08401281386613846,
+ "eval_roc_auc": 0.9076857807628663,
+ "eval_runtime": 670.4543,
+ "eval_samples_per_second": 4.26,
+ "eval_steps_per_second": 0.134,
+ "learning_rate": 0.0001
+ }
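The updated README figures appear to be the eval_results.json values rounded to four decimal places. The following sketch only illustrates that relationship. The `labels` mapping and the formatting are assumptions made for the example, not the code that generated the model card.

```python
# Values copied from eval_results.json in this commit.
eval_results = {
    "eval_loss": 0.08401281386613846,
    "eval_f1_micro": 0.8543162417321499,
    "eval_f1_macro": 0.7342630546801885,
    "eval_roc_auc": 0.9076857807628663,
    "eval_accuracy": 0.5605742296918768,
    "learning_rate": 0.0001,
}

# Hypothetical key-to-label mapping, chosen to match the README wording.
labels = {
    "eval_loss": "Loss",
    "eval_f1_micro": "F1 Micro",
    "eval_f1_macro": "F1 Macro",
    "eval_roc_auc": "Roc Auc",
    "eval_accuracy": "Accuracy",
    "learning_rate": "Learning Rate",
}

# Render each metric as a README-style bullet with four decimals.
readme_lines = [f"- {label}: {eval_results[key]:.4f}" for key, label in labels.items()]
print("\n".join(readme_lines))
```

The printed bullets match the `+` lines of the README.md hunk in this commit.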
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 70.0,
+ "learning_rate": 0.0001,
+ "train_loss": 0.11672632308896316,
+ "train_runtime": 200748.2354,
+ "train_samples_per_second": 3.057,
+ "train_steps_per_second": 0.096
+ }
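The trainer_state.json file added by this commit keeps per-epoch evaluation records and periodic training-loss records together in `log_history`. A minimal sketch of summarising such a list follows. The helper name `summarise` is hypothetical, and the sample holds only a handful of entries copied from the diff, so the "best" entry it reports is the best of that subset rather than the run's `best_metric`.

```python
from typing import Dict, List, Tuple

Entry = Dict[str, float]


def summarise(log_history: List[Entry]) -> Tuple[Entry, List[Tuple[int, float, float]]]:
    """Return the entry with the lowest eval_loss and the learning-rate changes.

    Each change is reported as (step, previous_lr, new_lr).
    """
    # Only evaluation records carry an "eval_loss" key.
    evals = [e for e in log_history if "eval_loss" in e]
    best = min(evals, key=lambda e: e["eval_loss"])

    changes = []
    prev_lr = None
    for entry in log_history:
        lr = entry.get("learning_rate")
        if lr is None:
            continue
        if prev_lr is not None and lr != prev_lr:
            changes.append((int(entry["step"]), prev_lr, lr))
        prev_lr = lr
    return best, changes


# A few entries copied from trainer_state.json (most fields omitted).
sample_history = [
    {"epoch": 1.0, "eval_loss": 0.13577787578105927, "learning_rate": 0.01, "step": 274},
    {"epoch": 12.0, "eval_loss": 0.1344844251871109, "learning_rate": 0.01, "step": 3288},
    {"epoch": 12.77, "learning_rate": 0.001, "loss": 0.1585, "step": 3500},
    {"epoch": 13.0, "eval_loss": 0.11546628922224045, "learning_rate": 0.001, "step": 3562},
    {"epoch": 35.0, "eval_loss": 0.09358564764261246, "learning_rate": 0.001, "step": 9590},
    {"epoch": 36.0, "eval_loss": 0.09342356771230698, "learning_rate": 0.0001, "step": 9864},
]

best, lr_changes = summarise(sample_history)
print("best eval entry in sample:", best["epoch"], best["eval_loss"])
print("learning-rate changes:", lr_changes)
```

The tenfold drops in the logged learning rate would be consistent with a plateau-based scheduler, but the commit does not say which scheduler was used.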
trainer_state.json ADDED
@@ -0,0 +1,1167 @@
+ {
+ "best_metric": 0.08306006342172623,
+ "best_model_checkpoint": "/home1/datawork/mcontini/models/multilabel/huggingface/dinov2-large-2024_01_05-kornia_img-size518_batch-size32_epochs70_freeze/checkpoint-19180",
+ "epoch": 70.0,
+ "eval_steps": 500,
+ "global_step": 19180,
+ "is_hyper_param_search": false,
+ "is_local_process_zero": true,
+ "is_world_process_zero": true,
+ "log_history": [
+ {
+ "epoch": 1.0,
+ "eval_accuracy": 0.44556873691556176,
+ "eval_f1_macro": 0.5755948244574681,
+ "eval_f1_micro": 0.7376394107473012,
+ "eval_loss": 0.13577787578105927,
+ "eval_roc_auc": 0.8276012534175776,
+ "eval_runtime": 686.0469,
+ "eval_samples_per_second": 4.178,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.01,
+ "step": 274
+ },
+ {
+ "epoch": 1.82,
+ "learning_rate": 0.01,
+ "loss": 0.1895,
+ "step": 500
+ },
+ {
+ "epoch": 2.0,
+ "eval_accuracy": 0.4357990230286113,
+ "eval_f1_macro": 0.6131029690652663,
+ "eval_f1_micro": 0.7463369963369964,
+ "eval_loss": 0.14224015176296234,
+ "eval_roc_auc": 0.8432701847378548,
+ "eval_runtime": 679.513,
+ "eval_samples_per_second": 4.218,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.01,
+ "step": 548
+ },
+ {
+ "epoch": 3.0,
+ "eval_accuracy": 0.38415910676901605,
+ "eval_f1_macro": 0.5242425898328716,
+ "eval_f1_micro": 0.7273147345925026,
+ "eval_loss": 0.21337130665779114,
+ "eval_roc_auc": 0.8305454415797603,
+ "eval_runtime": 681.6164,
+ "eval_samples_per_second": 4.205,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.01,
+ "step": 822
+ },
+ {
+ "epoch": 3.65,
+ "learning_rate": 0.01,
+ "loss": 0.1668,
+ "step": 1000
+ },
+ {
+ "epoch": 4.0,
+ "eval_accuracy": 0.4438241451500349,
+ "eval_f1_macro": 0.5474301561230492,
+ "eval_f1_micro": 0.7034210860994532,
+ "eval_loss": 0.14501234889030457,
+ "eval_roc_auc": 0.7947377699358407,
+ "eval_runtime": 676.7895,
+ "eval_samples_per_second": 4.235,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.01,
+ "step": 1096
+ },
+ {
+ "epoch": 5.0,
+ "eval_accuracy": 0.4438241451500349,
+ "eval_f1_macro": 0.6194844495540315,
+ "eval_f1_micro": 0.7611423380457615,
+ "eval_loss": 0.13293854892253876,
+ "eval_roc_auc": 0.8535844928345971,
+ "eval_runtime": 676.1205,
+ "eval_samples_per_second": 4.239,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.01,
+ "step": 1370
+ },
+ {
+ "epoch": 5.47,
+ "learning_rate": 0.01,
+ "loss": 0.1666,
+ "step": 1500
+ },
+ {
+ "epoch": 6.0,
+ "eval_accuracy": 0.44452198185624564,
+ "eval_f1_macro": 0.5624987041776927,
+ "eval_f1_micro": 0.752847713067352,
+ "eval_loss": 0.13243332505226135,
+ "eval_roc_auc": 0.8411441150969292,
+ "eval_runtime": 676.3907,
+ "eval_samples_per_second": 4.237,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.01,
+ "step": 1644
+ },
+ {
+ "epoch": 7.0,
+ "eval_accuracy": 0.43126308443824146,
+ "eval_f1_macro": 0.5689553622505643,
+ "eval_f1_micro": 0.7496488764044945,
+ "eval_loss": 0.13447266817092896,
+ "eval_roc_auc": 0.838952950800037,
+ "eval_runtime": 675.7851,
+ "eval_samples_per_second": 4.241,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.01,
+ "step": 1918
+ },
+ {
+ "epoch": 7.3,
+ "learning_rate": 0.01,
+ "loss": 0.1664,
+ "step": 2000
+ },
+ {
+ "epoch": 8.0,
+ "eval_accuracy": 0.4323098394975576,
+ "eval_f1_macro": 0.5627920395195278,
+ "eval_f1_micro": 0.7502482911725186,
+ "eval_loss": 0.13808754086494446,
+ "eval_roc_auc": 0.8397325865953646,
+ "eval_runtime": 681.8119,
+ "eval_samples_per_second": 4.204,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.01,
+ "step": 2192
+ },
+ {
+ "epoch": 9.0,
+ "eval_accuracy": 0.44033496161898117,
+ "eval_f1_macro": 0.5492251158735639,
+ "eval_f1_micro": 0.7395667604944316,
+ "eval_loss": 0.13694943487644196,
+ "eval_roc_auc": 0.8219722316265465,
+ "eval_runtime": 674.9771,
+ "eval_samples_per_second": 4.246,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.01,
+ "step": 2466
+ },
+ {
+ "epoch": 9.12,
+ "learning_rate": 0.01,
+ "loss": 0.1656,
+ "step": 2500
+ },
+ {
+ "epoch": 10.0,
+ "eval_accuracy": 0.4424284717376134,
+ "eval_f1_macro": 0.528180519175626,
+ "eval_f1_micro": 0.7326552851434799,
+ "eval_loss": 0.13609227538108826,
+ "eval_roc_auc": 0.821178691278072,
+ "eval_runtime": 674.2621,
+ "eval_samples_per_second": 4.251,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.01,
+ "step": 2740
+ },
+ {
+ "epoch": 10.95,
+ "learning_rate": 0.01,
+ "loss": 0.166,
+ "step": 3000
+ },
+ {
+ "epoch": 11.0,
+ "eval_accuracy": 0.4277739009071877,
+ "eval_f1_macro": 0.5428317486981787,
+ "eval_f1_micro": 0.7434225844004656,
+ "eval_loss": 0.1380929797887802,
+ "eval_roc_auc": 0.8371006883503846,
+ "eval_runtime": 676.158,
+ "eval_samples_per_second": 4.239,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.01,
+ "step": 3014
+ },
+ {
+ "epoch": 12.0,
+ "eval_accuracy": 0.444870900209351,
+ "eval_f1_macro": 0.5618568055480317,
+ "eval_f1_micro": 0.7354685646500594,
+ "eval_loss": 0.1344844251871109,
+ "eval_roc_auc": 0.827911942291835,
+ "eval_runtime": 683.7162,
+ "eval_samples_per_second": 4.192,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.01,
+ "step": 3288
+ },
+ {
+ "epoch": 12.77,
+ "learning_rate": 0.001,
+ "loss": 0.1585,
+ "step": 3500
+ },
+ {
+ "epoch": 13.0,
+ "eval_accuracy": 0.49023028611304953,
+ "eval_f1_macro": 0.650060261894195,
+ "eval_f1_micro": 0.8009333029820168,
+ "eval_loss": 0.11546628922224045,
+ "eval_roc_auc": 0.8745955707824836,
+ "eval_runtime": 679.0864,
+ "eval_samples_per_second": 4.22,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 3562
+ },
+ {
+ "epoch": 14.0,
+ "eval_accuracy": 0.5041870202372645,
+ "eval_f1_macro": 0.6697333276095331,
+ "eval_f1_micro": 0.8079378774805867,
+ "eval_loss": 0.11155486851930618,
+ "eval_roc_auc": 0.8750976636196655,
+ "eval_runtime": 676.6891,
+ "eval_samples_per_second": 4.235,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 3836
+ },
+ {
+ "epoch": 14.6,
+ "learning_rate": 0.001,
+ "loss": 0.133,
+ "step": 4000
+ },
+ {
+ "epoch": 15.0,
+ "eval_accuracy": 0.5181437543614794,
+ "eval_f1_macro": 0.6736053030113935,
+ "eval_f1_micro": 0.814943326393708,
+ "eval_loss": 0.10734836012125015,
+ "eval_roc_auc": 0.877205285207194,
+ "eval_runtime": 677.4924,
+ "eval_samples_per_second": 4.23,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 4110
+ },
+ {
+ "epoch": 16.0,
+ "eval_accuracy": 0.5083740404745289,
+ "eval_f1_macro": 0.7055616874566738,
+ "eval_f1_micro": 0.8238276299112801,
+ "eval_loss": 0.10476414114236832,
+ "eval_roc_auc": 0.8975418625131631,
+ "eval_runtime": 687.0217,
+ "eval_samples_per_second": 4.172,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.001,
+ "step": 4384
+ },
+ {
+ "epoch": 16.42,
+ "learning_rate": 0.001,
+ "loss": 0.1289,
+ "step": 4500
+ },
+ {
+ "epoch": 17.0,
+ "eval_accuracy": 0.5244242847173761,
+ "eval_f1_macro": 0.6896485755961196,
+ "eval_f1_micro": 0.820858825547487,
+ "eval_loss": 0.10253454744815826,
+ "eval_roc_auc": 0.8839468587595108,
+ "eval_runtime": 684.6037,
+ "eval_samples_per_second": 4.186,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.001,
+ "step": 4658
+ },
+ {
+ "epoch": 18.0,
+ "eval_accuracy": 0.5321004884856944,
+ "eval_f1_macro": 0.7045003592264228,
+ "eval_f1_micro": 0.8289563051845145,
+ "eval_loss": 0.10259302705526352,
+ "eval_roc_auc": 0.8916264271206406,
+ "eval_runtime": 685.7262,
+ "eval_samples_per_second": 4.18,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.001,
+ "step": 4932
+ },
+ {
+ "epoch": 18.25,
+ "learning_rate": 0.001,
+ "loss": 0.1227,
+ "step": 5000
+ },
+ {
+ "epoch": 19.0,
+ "eval_accuracy": 0.5279134682484299,
+ "eval_f1_macro": 0.6905367219275804,
+ "eval_f1_micro": 0.8306010928961749,
+ "eval_loss": 0.10123815387487411,
+ "eval_roc_auc": 0.8940566516497492,
+ "eval_runtime": 685.5377,
+ "eval_samples_per_second": 4.181,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.001,
+ "step": 5206
+ },
+ {
+ "epoch": 20.0,
+ "eval_accuracy": 0.5216329378925332,
+ "eval_f1_macro": 0.6830881274898382,
+ "eval_f1_micro": 0.8280441143371596,
+ "eval_loss": 0.09970748424530029,
+ "eval_roc_auc": 0.8930346669934526,
+ "eval_runtime": 686.9199,
+ "eval_samples_per_second": 4.172,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.001,
+ "step": 5480
+ },
+ {
+ "epoch": 20.07,
+ "learning_rate": 0.001,
+ "loss": 0.1202,
+ "step": 5500
+ },
+ {
+ "epoch": 21.0,
+ "eval_accuracy": 0.5352407536636427,
+ "eval_f1_macro": 0.6926783323821563,
+ "eval_f1_micro": 0.8300336623495178,
+ "eval_loss": 0.09886988252401352,
+ "eval_roc_auc": 0.8896211857387517,
+ "eval_runtime": 684.9162,
+ "eval_samples_per_second": 4.184,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.001,
+ "step": 5754
+ },
+ {
+ "epoch": 21.9,
+ "learning_rate": 0.001,
+ "loss": 0.12,
+ "step": 6000
+ },
+ {
+ "epoch": 22.0,
+ "eval_accuracy": 0.5209351011863224,
+ "eval_f1_macro": 0.6961228606859606,
+ "eval_f1_micro": 0.8279826958105646,
+ "eval_loss": 0.09963646531105042,
+ "eval_roc_auc": 0.8892587586568824,
+ "eval_runtime": 686.1389,
+ "eval_samples_per_second": 4.177,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.001,
+ "step": 6028
+ },
+ {
+ "epoch": 23.0,
+ "eval_accuracy": 0.5195394277739009,
+ "eval_f1_macro": 0.6958628426894405,
+ "eval_f1_micro": 0.831919078392807,
+ "eval_loss": 0.09720779210329056,
+ "eval_roc_auc": 0.8955944894582717,
+ "eval_runtime": 693.3459,
+ "eval_samples_per_second": 4.134,
+ "eval_steps_per_second": 0.13,
+ "learning_rate": 0.001,
+ "step": 6302
+ },
+ {
+ "epoch": 23.72,
+ "learning_rate": 0.001,
+ "loss": 0.1179,
+ "step": 6500
+ },
+ {
+ "epoch": 24.0,
+ "eval_accuracy": 0.5212840195394278,
+ "eval_f1_macro": 0.6881053152313114,
+ "eval_f1_micro": 0.8270608813406306,
+ "eval_loss": 0.10082241147756577,
+ "eval_roc_auc": 0.8915954736236973,
+ "eval_runtime": 682.9894,
+ "eval_samples_per_second": 4.196,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.001,
+ "step": 6576
+ },
+ {
+ "epoch": 25.0,
+ "eval_accuracy": 0.5268667131891137,
+ "eval_f1_macro": 0.6859679989625925,
+ "eval_f1_micro": 0.8283316086006668,
+ "eval_loss": 0.09828384965658188,
+ "eval_roc_auc": 0.8862551588199984,
+ "eval_runtime": 673.7378,
+ "eval_samples_per_second": 4.254,
+ "eval_steps_per_second": 0.134,
+ "learning_rate": 0.001,
+ "step": 6850
+ },
+ {
+ "epoch": 25.55,
+ "learning_rate": 0.001,
+ "loss": 0.1166,
+ "step": 7000
+ },
+ {
+ "epoch": 26.0,
+ "eval_accuracy": 0.5310537334263782,
+ "eval_f1_macro": 0.6805616825898689,
+ "eval_f1_micro": 0.8284274424464553,
+ "eval_loss": 0.09853371977806091,
+ "eval_roc_auc": 0.8875551335725609,
+ "eval_runtime": 679.0226,
+ "eval_samples_per_second": 4.221,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 7124
+ },
+ {
+ "epoch": 27.0,
+ "eval_accuracy": 0.5324494068387997,
+ "eval_f1_macro": 0.6901040821549612,
+ "eval_f1_micro": 0.8305464575073264,
+ "eval_loss": 0.09571811556816101,
+ "eval_roc_auc": 0.887615396252071,
+ "eval_runtime": 672.1908,
+ "eval_samples_per_second": 4.264,
+ "eval_steps_per_second": 0.134,
+ "learning_rate": 0.001,
+ "step": 7398
+ },
+ {
+ "epoch": 27.37,
+ "learning_rate": 0.001,
+ "loss": 0.1158,
+ "step": 7500
+ },
+ {
+ "epoch": 28.0,
+ "eval_accuracy": 0.5177948360083741,
+ "eval_f1_macro": 0.7054421966314011,
+ "eval_f1_micro": 0.8292325882551658,
+ "eval_loss": 0.09954769909381866,
+ "eval_roc_auc": 0.8934873150088631,
+ "eval_runtime": 681.7881,
+ "eval_samples_per_second": 4.204,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.001,
+ "step": 7672
+ },
+ {
+ "epoch": 29.0,
+ "eval_accuracy": 0.5334961618981159,
+ "eval_f1_macro": 0.7026467347883069,
+ "eval_f1_micro": 0.8363861804655357,
+ "eval_loss": 0.09332505613565445,
+ "eval_roc_auc": 0.8970916521216963,
+ "eval_runtime": 684.6938,
+ "eval_samples_per_second": 4.186,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.001,
+ "step": 7946
+ },
+ {
+ "epoch": 29.2,
+ "learning_rate": 0.001,
+ "loss": 0.114,
+ "step": 8000
+ },
+ {
+ "epoch": 30.0,
+ "eval_accuracy": 0.5258199581297976,
+ "eval_f1_macro": 0.7109768073155117,
+ "eval_f1_micro": 0.8351258454374099,
+ "eval_loss": 0.09473367780447006,
+ "eval_roc_auc": 0.901874991489952,
+ "eval_runtime": 677.918,
+ "eval_samples_per_second": 4.228,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 8220
+ },
+ {
+ "epoch": 31.0,
+ "eval_accuracy": 0.5331472435450104,
+ "eval_f1_macro": 0.7175382540523837,
+ "eval_f1_micro": 0.8365119611950171,
+ "eval_loss": 0.09674925357103348,
+ "eval_roc_auc": 0.9045853985344947,
+ "eval_runtime": 675.0971,
+ "eval_samples_per_second": 4.245,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 8494
+ },
+ {
+ "epoch": 31.02,
+ "learning_rate": 0.001,
+ "loss": 0.1134,
+ "step": 8500
+ },
+ {
+ "epoch": 32.0,
+ "eval_accuracy": 0.5324494068387997,
+ "eval_f1_macro": 0.6932594476375145,
+ "eval_f1_micro": 0.8353541076487252,
+ "eval_loss": 0.09490892291069031,
+ "eval_roc_auc": 0.8947967085095635,
+ "eval_runtime": 677.2028,
+ "eval_samples_per_second": 4.232,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 8768
+ },
+ {
+ "epoch": 32.85,
+ "learning_rate": 0.001,
+ "loss": 0.113,
+ "step": 9000
+ },
+ {
+ "epoch": 33.0,
+ "eval_accuracy": 0.5362875087229588,
+ "eval_f1_macro": 0.6973292248077614,
+ "eval_f1_micro": 0.8367208672086721,
+ "eval_loss": 0.09511947631835938,
+ "eval_roc_auc": 0.8966987186810037,
+ "eval_runtime": 679.2015,
+ "eval_samples_per_second": 4.22,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 9042
+ },
+ {
+ "epoch": 34.0,
+ "eval_accuracy": 0.5380321004884857,
+ "eval_f1_macro": 0.6878227037845351,
+ "eval_f1_micro": 0.8334680679062246,
+ "eval_loss": 0.09364539384841919,
+ "eval_roc_auc": 0.8876181367760314,
+ "eval_runtime": 675.0346,
+ "eval_samples_per_second": 4.246,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 9316
+ },
+ {
+ "epoch": 34.67,
+ "learning_rate": 0.001,
+ "loss": 0.1124,
+ "step": 9500
+ },
+ {
+ "epoch": 35.0,
+ "eval_accuracy": 0.5310537334263782,
+ "eval_f1_macro": 0.6856042645068489,
+ "eval_f1_micro": 0.833974649162517,
+ "eval_loss": 0.09358564764261246,
+ "eval_roc_auc": 0.8944494841237697,
+ "eval_runtime": 678.2782,
+ "eval_samples_per_second": 4.225,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.001,
+ "step": 9590
+ },
+ {
+ "epoch": 36.0,
+ "eval_accuracy": 0.5453593859036985,
+ "eval_f1_macro": 0.729828782855425,
+ "eval_f1_micro": 0.8455960879096174,
+ "eval_loss": 0.09342356771230698,
+ "eval_roc_auc": 0.9030647539078717,
+ "eval_runtime": 674.7512,
+ "eval_samples_per_second": 4.247,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 9864
+ },
+ {
+ "epoch": 36.5,
+ "learning_rate": 0.0001,
+ "loss": 0.1083,
+ "step": 10000
+ },
+ {
+ "epoch": 37.0,
+ "eval_accuracy": 0.54675505931612,
+ "eval_f1_macro": 0.7188765655113909,
+ "eval_f1_micro": 0.8456650022696323,
+ "eval_loss": 0.09240464121103287,
+ "eval_roc_auc": 0.8999478550409371,
+ "eval_runtime": 674.676,
+ "eval_samples_per_second": 4.248,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 10138
+ },
+ {
+ "epoch": 38.0,
+ "eval_accuracy": 0.5450104675505931,
+ "eval_f1_macro": 0.7089159960142193,
+ "eval_f1_micro": 0.8449173647271904,
+ "eval_loss": 0.09147636592388153,
+ "eval_roc_auc": 0.9003695495594045,
+ "eval_runtime": 675.4085,
+ "eval_samples_per_second": 4.243,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 10412
+ },
+ {
+ "epoch": 38.32,
+ "learning_rate": 0.0001,
+ "loss": 0.1034,
+ "step": 10500
+ },
+ {
+ "epoch": 39.0,
+ "eval_accuracy": 0.5484996510816469,
+ "eval_f1_macro": 0.725215575661352,
+ "eval_f1_micro": 0.8487853799866281,
+ "eval_loss": 0.09022974222898483,
+ "eval_roc_auc": 0.9078051247451889,
+ "eval_runtime": 672.7306,
+ "eval_samples_per_second": 4.26,
+ "eval_steps_per_second": 0.134,
+ "learning_rate": 0.0001,
+ "step": 10686
+ },
+ {
+ "epoch": 40.0,
+ "eval_accuracy": 0.5495464061409631,
+ "eval_f1_macro": 0.7182446688615595,
+ "eval_f1_micro": 0.8458797579322437,
+ "eval_loss": 0.09058264642953873,
+ "eval_roc_auc": 0.9011424061638826,
+ "eval_runtime": 678.9298,
+ "eval_samples_per_second": 4.221,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 10960
+ },
+ {
+ "epoch": 40.15,
+ "learning_rate": 0.0001,
+ "loss": 0.1024,
+ "step": 11000
+ },
+ {
+ "epoch": 41.0,
+ "eval_accuracy": 0.5505931612002791,
+ "eval_f1_macro": 0.7130026819185953,
+ "eval_f1_micro": 0.8481005491705826,
+ "eval_loss": 0.08943015336990356,
+ "eval_roc_auc": 0.902018393012137,
+ "eval_runtime": 676.4356,
+ "eval_samples_per_second": 4.237,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 11234
+ },
+ {
+ "epoch": 41.97,
+ "learning_rate": 0.0001,
+ "loss": 0.1004,
+ "step": 11500
+ },
+ {
+ "epoch": 42.0,
+ "eval_accuracy": 0.5519888346127007,
+ "eval_f1_macro": 0.7148190184347656,
+ "eval_f1_micro": 0.8457012282205084,
+ "eval_loss": 0.08726447820663452,
+ "eval_roc_auc": 0.8977194051943719,
+ "eval_runtime": 677.9496,
+ "eval_samples_per_second": 4.227,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 11508
+ },
+ {
+ "epoch": 43.0,
+ "eval_accuracy": 0.5537334263782275,
+ "eval_f1_macro": 0.71816969331258,
+ "eval_f1_micro": 0.8494563389754511,
+ "eval_loss": 0.08699071407318115,
+ "eval_roc_auc": 0.906163507621426,
+ "eval_runtime": 676.5268,
+ "eval_samples_per_second": 4.236,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 11782
+ },
+ {
+ "epoch": 43.8,
+ "learning_rate": 0.0001,
+ "loss": 0.0998,
+ "step": 12000
+ },
+ {
+ "epoch": 44.0,
+ "eval_accuracy": 0.5498953244940684,
+ "eval_f1_macro": 0.7261208407998851,
+ "eval_f1_micro": 0.8486114247008355,
+ "eval_loss": 0.08676985651254654,
+ "eval_roc_auc": 0.9033042612782081,
+ "eval_runtime": 674.6536,
+ "eval_samples_per_second": 4.248,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 12056
+ },
+ {
+ "epoch": 45.0,
+ "eval_accuracy": 0.555129099790649,
+ "eval_f1_macro": 0.7235580263821535,
+ "eval_f1_micro": 0.8493258426966293,
+ "eval_loss": 0.08680889010429382,
+ "eval_roc_auc": 0.9052931027877648,
+ "eval_runtime": 688.2557,
+ "eval_samples_per_second": 4.164,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.0001,
+ "step": 12330
+ },
+ {
+ "epoch": 45.62,
+ "learning_rate": 0.0001,
+ "loss": 0.0975,
+ "step": 12500
+ },
+ {
+ "epoch": 46.0,
+ "eval_accuracy": 0.5512909979064898,
+ "eval_f1_macro": 0.7317716716296281,
+ "eval_f1_micro": 0.8489586241554526,
+ "eval_loss": 0.0865492969751358,
+ "eval_roc_auc": 0.9071607698668371,
+ "eval_runtime": 684.0678,
+ "eval_samples_per_second": 4.19,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.0001,
+ "step": 12604
+ },
+ {
+ "epoch": 47.0,
+ "eval_accuracy": 0.5547801814375436,
+ "eval_f1_macro": 0.7390020274567815,
+ "eval_f1_micro": 0.8512299882858259,
+ "eval_loss": 0.08599700033664703,
+ "eval_roc_auc": 0.908765101440927,
+ "eval_runtime": 688.2365,
+ "eval_samples_per_second": 4.164,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.0001,
+ "step": 12878
+ },
+ {
+ "epoch": 47.45,
+ "learning_rate": 0.0001,
+ "loss": 0.099,
+ "step": 13000
+ },
+ {
+ "epoch": 48.0,
+ "eval_accuracy": 0.5558269364968598,
+ "eval_f1_macro": 0.7360003523455093,
+ "eval_f1_micro": 0.8509512552065742,
+ "eval_loss": 0.08596429973840714,
+ "eval_roc_auc": 0.9055422308395834,
+ "eval_runtime": 686.3198,
+ "eval_samples_per_second": 4.176,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.0001,
+ "step": 13152
+ },
+ {
+ "epoch": 49.0,
+ "eval_accuracy": 0.5547801814375436,
+ "eval_f1_macro": 0.7361919298080869,
+ "eval_f1_micro": 0.849985959000281,
+ "eval_loss": 0.08584348857402802,
+ "eval_roc_auc": 0.9057525299940252,
+ "eval_runtime": 688.3813,
+ "eval_samples_per_second": 4.163,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.0001,
+ "step": 13426
+ },
+ {
+ "epoch": 49.27,
+ "learning_rate": 0.0001,
+ "loss": 0.0972,
+ "step": 13500
+ },
+ {
+ "epoch": 50.0,
+ "eval_accuracy": 0.5586182833217027,
+ "eval_f1_macro": 0.725712332481399,
+ "eval_f1_micro": 0.8505096262740656,
+ "eval_loss": 0.08557379245758057,
+ "eval_roc_auc": 0.9032805341102342,
+ "eval_runtime": 685.8179,
+ "eval_samples_per_second": 4.179,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.0001,
+ "step": 13700
+ },
823
+ {
824
+ "epoch": 51.0,
825
+ "eval_accuracy": 0.557920446615492,
826
+ "eval_f1_macro": 0.7408593608052999,
827
+ "eval_f1_micro": 0.8500254194204373,
828
+ "eval_loss": 0.08562461286783218,
829
+ "eval_roc_auc": 0.9038335718454608,
830
+ "eval_runtime": 683.3234,
831
+ "eval_samples_per_second": 4.194,
832
+ "eval_steps_per_second": 0.132,
833
+ "learning_rate": 0.0001,
834
+ "step": 13974
835
+ },
836
+ {
837
+ "epoch": 51.09,
838
+ "learning_rate": 0.0001,
839
+ "loss": 0.0957,
840
+ "step": 14000
841
+ },
842
+ {
843
+ "epoch": 52.0,
844
+ "eval_accuracy": 0.5568736915561758,
845
+ "eval_f1_macro": 0.7232142709265429,
846
+ "eval_f1_micro": 0.8507868221442318,
847
+ "eval_loss": 0.08591117709875107,
848
+ "eval_roc_auc": 0.9035466101391771,
849
+ "eval_runtime": 693.4248,
850
+ "eval_samples_per_second": 4.133,
851
+ "eval_steps_per_second": 0.13,
852
+ "learning_rate": 0.0001,
853
+ "step": 14248
854
+ },
855
+ {
856
+ "epoch": 52.92,
857
+ "learning_rate": 0.0001,
858
+ "loss": 0.0964,
859
+ "step": 14500
860
+ },
861
+ {
862
+ "epoch": 53.0,
863
+ "eval_accuracy": 0.5628053035589672,
864
+ "eval_f1_macro": 0.7275870481420489,
865
+ "eval_f1_micro": 0.852056338028169,
866
+ "eval_loss": 0.08490145951509476,
867
+ "eval_roc_auc": 0.9058454914515268,
868
+ "eval_runtime": 691.5127,
869
+ "eval_samples_per_second": 4.145,
870
+ "eval_steps_per_second": 0.13,
871
+ "learning_rate": 0.0001,
872
+ "step": 14522
873
+ },
874
+ {
875
+ "epoch": 54.0,
876
+ "eval_accuracy": 0.5537334263782275,
877
+ "eval_f1_macro": 0.7394514344990791,
878
+ "eval_f1_micro": 0.85390386218394,
879
+ "eval_loss": 0.08516541868448257,
880
+ "eval_roc_auc": 0.9115532672468961,
881
+ "eval_runtime": 698.0325,
882
+ "eval_samples_per_second": 4.106,
883
+ "eval_steps_per_second": 0.129,
884
+ "learning_rate": 0.0001,
885
+ "step": 14796
886
+ },
887
+ {
888
+ "epoch": 54.74,
889
+ "learning_rate": 0.0001,
890
+ "loss": 0.0955,
891
+ "step": 15000
892
+ },
893
+ {
894
+ "epoch": 55.0,
895
+ "eval_accuracy": 0.5565247732030705,
896
+ "eval_f1_macro": 0.7354184764103003,
897
+ "eval_f1_micro": 0.8511167656205825,
898
+ "eval_loss": 0.08514942973852158,
899
+ "eval_roc_auc": 0.904089479088129,
900
+ "eval_runtime": 681.3519,
901
+ "eval_samples_per_second": 4.206,
902
+ "eval_steps_per_second": 0.132,
903
+ "learning_rate": 0.0001,
904
+ "step": 15070
905
+ },
+ {
+ "epoch": 56.0,
+ "eval_accuracy": 0.5572226099092812,
+ "eval_f1_macro": 0.736739641327092,
+ "eval_f1_micro": 0.8529461421576904,
+ "eval_loss": 0.08491206169128418,
+ "eval_roc_auc": 0.9066984002032717,
+ "eval_runtime": 677.9791,
+ "eval_samples_per_second": 4.227,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 15344
+ },
+ {
+ "epoch": 56.57,
+ "learning_rate": 0.0001,
+ "loss": 0.095,
+ "step": 15500
+ },
+ {
+ "epoch": 57.0,
+ "eval_accuracy": 0.5537334263782275,
+ "eval_f1_macro": 0.7241830253482859,
+ "eval_f1_micro": 0.8493824336688013,
+ "eval_loss": 0.0848437026143074,
+ "eval_roc_auc": 0.8993941682342463,
+ "eval_runtime": 677.1333,
+ "eval_samples_per_second": 4.233,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 15618
+ },
+ {
+ "epoch": 58.0,
+ "eval_accuracy": 0.5593161200279134,
+ "eval_f1_macro": 0.7363418087082886,
+ "eval_f1_micro": 0.8511604153662826,
+ "eval_loss": 0.08454328030347824,
+ "eval_roc_auc": 0.9029315644433922,
+ "eval_runtime": 675.3593,
+ "eval_samples_per_second": 4.244,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 15892
+ },
+ {
+ "epoch": 58.39,
+ "learning_rate": 0.0001,
+ "loss": 0.093,
+ "step": 16000
+ },
+ {
+ "epoch": 59.0,
+ "eval_accuracy": 0.560711793440335,
+ "eval_f1_macro": 0.73901392865669,
+ "eval_f1_micro": 0.8530955471527739,
+ "eval_loss": 0.08396653085947037,
+ "eval_roc_auc": 0.9058246057741859,
+ "eval_runtime": 679.8275,
+ "eval_samples_per_second": 4.216,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.0001,
+ "step": 16166
+ },
+ {
+ "epoch": 60.0,
+ "eval_accuracy": 0.5561758548499651,
+ "eval_f1_macro": 0.7472770304573509,
+ "eval_f1_micro": 0.852848189028787,
+ "eval_loss": 0.08474517613649368,
+ "eval_roc_auc": 0.9116141789978706,
+ "eval_runtime": 679.8207,
+ "eval_samples_per_second": 4.216,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.0001,
+ "step": 16440
+ },
+ {
+ "epoch": 60.22,
+ "learning_rate": 0.0001,
+ "loss": 0.0936,
+ "step": 16500
+ },
+ {
+ "epoch": 61.0,
+ "eval_accuracy": 0.552337752965806,
+ "eval_f1_macro": 0.7425280881449604,
+ "eval_f1_micro": 0.8516569637259293,
+ "eval_loss": 0.08434043824672699,
+ "eval_roc_auc": 0.9078005379735077,
+ "eval_runtime": 678.8837,
+ "eval_samples_per_second": 4.222,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 16714
+ },
+ {
+ "epoch": 62.0,
+ "eval_accuracy": 0.5540823447313329,
+ "eval_f1_macro": 0.7455853496732745,
+ "eval_f1_micro": 0.8515365097265295,
+ "eval_loss": 0.08436089754104614,
+ "eval_roc_auc": 0.905273313320008,
+ "eval_runtime": 684.8061,
+ "eval_samples_per_second": 4.185,
+ "eval_steps_per_second": 0.131,
+ "learning_rate": 0.0001,
+ "step": 16988
+ },
+ {
+ "epoch": 62.04,
+ "learning_rate": 0.0001,
+ "loss": 0.0932,
+ "step": 17000
+ },
+ {
+ "epoch": 63.0,
+ "eval_accuracy": 0.5575715282623867,
+ "eval_f1_macro": 0.7344319075168565,
+ "eval_f1_micro": 0.8535319341006545,
+ "eval_loss": 0.0839960053563118,
+ "eval_roc_auc": 0.9061600170593289,
+ "eval_runtime": 676.8011,
+ "eval_samples_per_second": 4.235,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 17262
+ },
+ {
+ "epoch": 63.87,
+ "learning_rate": 0.0001,
+ "loss": 0.0933,
+ "step": 17500
+ },
+ {
+ "epoch": 64.0,
+ "eval_accuracy": 0.5614096301465457,
+ "eval_f1_macro": 0.7405199466064576,
+ "eval_f1_micro": 0.8543109759531453,
+ "eval_loss": 0.08395348489284515,
+ "eval_roc_auc": 0.907220383474883,
+ "eval_runtime": 676.0097,
+ "eval_samples_per_second": 4.24,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 17536
+ },
+ {
+ "epoch": 65.0,
+ "eval_accuracy": 0.557920446615492,
+ "eval_f1_macro": 0.7354221702015719,
+ "eval_f1_micro": 0.8506689439225733,
+ "eval_loss": 0.08403661847114563,
+ "eval_roc_auc": 0.9015583167344123,
+ "eval_runtime": 675.0487,
+ "eval_samples_per_second": 4.246,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 17810
+ },
+ {
+ "epoch": 65.69,
+ "learning_rate": 0.0001,
+ "loss": 0.0921,
+ "step": 18000
+ },
+ {
+ "epoch": 66.0,
+ "eval_accuracy": 0.5568736915561758,
+ "eval_f1_macro": 0.7296578358578595,
+ "eval_f1_micro": 0.852865023077789,
+ "eval_loss": 0.08408054709434509,
+ "eval_roc_auc": 0.9065963661690798,
+ "eval_runtime": 680.0166,
+ "eval_samples_per_second": 4.215,
+ "eval_steps_per_second": 0.132,
+ "learning_rate": 0.0001,
+ "step": 18084
+ },
+ {
+ "epoch": 67.0,
+ "eval_accuracy": 0.5540823447313329,
+ "eval_f1_macro": 0.7392975848141861,
+ "eval_f1_micro": 0.8539689628223736,
+ "eval_loss": 0.08376849442720413,
+ "eval_roc_auc": 0.9100385075348831,
+ "eval_runtime": 675.941,
+ "eval_samples_per_second": 4.24,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 18358
+ },
+ {
+ "epoch": 67.52,
+ "learning_rate": 0.0001,
+ "loss": 0.0913,
+ "step": 18500
+ },
+ {
+ "epoch": 68.0,
+ "eval_accuracy": 0.5572226099092812,
+ "eval_f1_macro": 0.7403483881006915,
+ "eval_f1_micro": 0.854102492299076,
+ "eval_loss": 0.08355987071990967,
+ "eval_roc_auc": 0.9089826269243382,
+ "eval_runtime": 676.7975,
+ "eval_samples_per_second": 4.235,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 18632
+ },
+ {
+ "epoch": 69.0,
+ "eval_accuracy": 0.5582693649685974,
+ "eval_f1_macro": 0.7494443807338856,
+ "eval_f1_micro": 0.8547792062604807,
+ "eval_loss": 0.08346723765134811,
+ "eval_roc_auc": 0.9100283387486087,
+ "eval_runtime": 675.2406,
+ "eval_samples_per_second": 4.244,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 18906
+ },
+ {
+ "epoch": 69.34,
+ "learning_rate": 0.0001,
+ "loss": 0.0911,
+ "step": 19000
+ },
+ {
+ "epoch": 70.0,
+ "eval_accuracy": 0.5561758548499651,
+ "eval_f1_macro": 0.7486606655073544,
+ "eval_f1_micro": 0.8551793496480055,
+ "eval_loss": 0.08306006342172623,
+ "eval_roc_auc": 0.9104037761073852,
+ "eval_runtime": 675.6626,
+ "eval_samples_per_second": 4.242,
+ "eval_steps_per_second": 0.133,
+ "learning_rate": 0.0001,
+ "step": 19180
+ },
+ {
+ "epoch": 70.0,
+ "learning_rate": 0.0001,
+ "step": 19180,
+ "total_flos": 9.099793269879256e+20,
+ "train_loss": 0.11672632308896316,
+ "train_runtime": 200748.2354,
+ "train_samples_per_second": 3.057,
+ "train_steps_per_second": 0.096
+ }
+ ],
+ "logging_steps": 500,
+ "max_steps": 19180,
+ "num_train_epochs": 70,
+ "save_steps": 500,
+ "total_flos": 9.099793269879256e+20,
+ "trial_name": null,
+ "trial_params": null
+ }