RylanSchaeffer
committed on
Commit
•
08035f8
1
Parent(s):
94dd4da
End of training
- README.md +270 -269
- model-00001-of-00002.safetensors +1 -1
- model-00002-of-00002.safetensors +1 -1
- training_args.bin +1 -1
README.md
CHANGED
@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
-- Num Input Tokens Seen:

 ## Model description

@@ -53,273 +53,274 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
 |:-------------:|:------:|:----:|:---------------:|:-----------------:|
 | No log | 0 | 0 | 1.3909 | 0 |
-[267 removed table rows (old lines 56-322); their values are truncated in the source]


 ### Framework versions

 This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.1072
+- Num Input Tokens Seen: 72483520

 ## Model description

 | Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
 |:-------------:|:------:|:----:|:---------------:|:-----------------:|
 | No log | 0 | 0 | 1.3909 | 0 |
+| 1.6216 | 0.0037 | 5 | 1.3897 | 273928 |
+| 1.5677 | 0.0075 | 10 | 1.3772 | 540496 |
+| 1.5223 | 0.0112 | 15 | 1.3459 | 806024 |
+| 1.4364 | 0.0149 | 20 | 1.2958 | 1080064 |
+| 1.3754 | 0.0186 | 25 | 1.2548 | 1342696 |
+| 1.341 | 0.0224 | 30 | 1.2272 | 1609776 |
+| 1.2543 | 0.0261 | 35 | 1.1938 | 1883584 |
+| 1.1445 | 0.0298 | 40 | 1.1967 | 2159576 |
+| 1.0439 | 0.0336 | 45 | 1.2223 | 2427016 |
+| 0.9516 | 0.0373 | 50 | 1.2217 | 2699496 |
+| 0.8087 | 0.0410 | 55 | 1.2222 | 2961896 |
+| 0.6456 | 0.0447 | 60 | 1.2632 | 3230456 |
+| 0.5882 | 0.0485 | 65 | 1.2657 | 3495256 |
+| 0.54 | 0.0522 | 70 | 1.2746 | 3766984 |
+| 0.4778 | 0.0559 | 75 | 1.2396 | 4040984 |
+| 0.4138 | 0.0597 | 80 | 1.2546 | 4317400 |
+| 0.3985 | 0.0634 | 85 | 1.2427 | 4587088 |
+| 0.4226 | 0.0671 | 90 | 1.2237 | 4851192 |
+| 0.335 | 0.0708 | 95 | 1.2147 | 5128392 |
+| 0.3381 | 0.0746 | 100 | 1.2222 | 5402272 |
+| 0.2826 | 0.0783 | 105 | 1.2087 | 5672736 |
+| 0.3858 | 0.0820 | 110 | 1.2034 | 5945640 |
+| 0.2916 | 0.0858 | 115 | 1.2149 | 6213712 |
+| 0.1812 | 0.0895 | 120 | 1.2091 | 6480944 |
+| 0.2106 | 0.0932 | 125 | 1.2118 | 6749984 |
+| 0.2204 | 0.0969 | 130 | 1.2141 | 7022400 |
+| 0.26 | 0.1007 | 135 | 1.2027 | 7290400 |
+| 0.2057 | 0.1044 | 140 | 1.1904 | 7559840 |
+| 0.1634 | 0.1081 | 145 | 1.2047 | 7828648 |
+| 0.2798 | 0.1119 | 150 | 1.1945 | 8103064 |
+| 0.218 | 0.1156 | 155 | 1.1967 | 8374240 |
+| 0.2143 | 0.1193 | 160 | 1.1997 | 8649728 |
+| 0.282 | 0.1230 | 165 | 1.1903 | 8915992 |
+| 0.2373 | 0.1268 | 170 | 1.1923 | 9179120 |
+| 0.186 | 0.1305 | 175 | 1.1857 | 9444544 |
+| 0.2151 | 0.1342 | 180 | 1.1888 | 9719840 |
+| 0.1982 | 0.1380 | 185 | 1.1860 | 9987088 |
+| 0.2093 | 0.1417 | 190 | 1.1933 | 10260608 |
+| 0.1927 | 0.1454 | 195 | 1.1829 | 10531552 |
+| 0.2653 | 0.1491 | 200 | 1.1856 | 10803216 |
+| 0.1893 | 0.1529 | 205 | 1.1855 | 11077920 |
+| 0.2772 | 0.1566 | 210 | 1.1858 | 11343864 |
+| 0.2151 | 0.1603 | 215 | 1.1842 | 11616992 |
+| 0.2485 | 0.1641 | 220 | 1.1834 | 11892040 |
+| 0.2226 | 0.1678 | 225 | 1.1787 | 12162184 |
+| 0.1264 | 0.1715 | 230 | 1.1788 | 12429176 |
+| 0.1665 | 0.1753 | 235 | 1.1733 | 12696000 |
+| 0.1108 | 0.1790 | 240 | 1.1739 | 12965168 |
+| 0.185 | 0.1827 | 245 | 1.1671 | 13239112 |
+| 0.2626 | 0.1864 | 250 | 1.1734 | 13514032 |
+| 0.1595 | 0.1902 | 255 | 1.1717 | 13785752 |
+| 0.2451 | 0.1939 | 260 | 1.1687 | 14059520 |
+| 0.2444 | 0.1976 | 265 | 1.1697 | 14324224 |
+| 0.2495 | 0.2014 | 270 | 1.1663 | 14590360 |
+| 0.2167 | 0.2051 | 275 | 1.1653 | 14853096 |
+| 0.1973 | 0.2088 | 280 | 1.1688 | 15122224 |
+| 0.1801 | 0.2125 | 285 | 1.1663 | 15394480 |
+| 0.1666 | 0.2163 | 290 | 1.1666 | 15661024 |
+| 0.1642 | 0.2200 | 295 | 1.1688 | 15928512 |
+| 0.2069 | 0.2237 | 300 | 1.1648 | 16201536 |
+| 0.1672 | 0.2275 | 305 | 1.1624 | 16470776 |
+| 0.1446 | 0.2312 | 310 | 1.1688 | 16744768 |
+| 0.1332 | 0.2349 | 315 | 1.1606 | 17008240 |
+| 0.1447 | 0.2386 | 320 | 1.1595 | 17273856 |
+| 0.1407 | 0.2424 | 325 | 1.1664 | 17549784 |
+| 0.2198 | 0.2461 | 330 | 1.1601 | 17822968 |
+| 0.1968 | 0.2498 | 335 | 1.1568 | 18095368 |
+| 0.1826 | 0.2536 | 340 | 1.1608 | 18371224 |
+| 0.1624 | 0.2573 | 345 | 1.1594 | 18648808 |
+| 0.1164 | 0.2610 | 350 | 1.1552 | 18912232 |
+| 0.1232 | 0.2647 | 355 | 1.1584 | 19180288 |
+| 0.2007 | 0.2685 | 360 | 1.1596 | 19453232 |
+| 0.163 | 0.2722 | 365 | 1.1516 | 19727312 |
+| 0.1141 | 0.2759 | 370 | 1.1587 | 20011560 |
+| 0.1235 | 0.2797 | 375 | 1.1526 | 20283680 |
+| 0.1914 | 0.2834 | 380 | 1.1499 | 20550296 |
+| 0.1682 | 0.2871 | 385 | 1.1512 | 20821616 |
+| 0.1194 | 0.2908 | 390 | 1.1508 | 21095920 |
+| 0.1079 | 0.2946 | 395 | 1.1529 | 21372392 |
+| 0.166 | 0.2983 | 400 | 1.1514 | 21644592 |
+| 0.1262 | 0.3020 | 405 | 1.1497 | 21914152 |
+| 0.1624 | 0.3058 | 410 | 1.1526 | 22186664 |
+| 0.1772 | 0.3095 | 415 | 1.1478 | 22459632 |
+| 0.2304 | 0.3132 | 420 | 1.1476 | 22730888 |
+| 0.0887 | 0.3169 | 425 | 1.1456 | 22992464 |
+| 0.1033 | 0.3207 | 430 | 1.1478 | 23263416 |
+| 0.1526 | 0.3244 | 435 | 1.1456 | 23530720 |
+| 0.1425 | 0.3281 | 440 | 1.1433 | 23792528 |
+| 0.1928 | 0.3319 | 445 | 1.1422 | 24066464 |
+| 0.1651 | 0.3356 | 450 | 1.1433 | 24328952 |
+| 0.1117 | 0.3393 | 455 | 1.1480 | 24589040 |
+| 0.1578 | 0.3430 | 460 | 1.1464 | 24861792 |
+| 0.1554 | 0.3468 | 465 | 1.1408 | 25140824 |
+| 0.1505 | 0.3505 | 470 | 1.1425 | 25408400 |
+| 0.1613 | 0.3542 | 475 | 1.1416 | 25681448 |
+| 0.1858 | 0.3580 | 480 | 1.1394 | 25948192 |
+| 0.1362 | 0.3617 | 485 | 1.1410 | 26216376 |
+| 0.2001 | 0.3654 | 490 | 1.1416 | 26485904 |
+| 0.153 | 0.3691 | 495 | 1.1407 | 26753784 |
+| 0.2446 | 0.3729 | 500 | 1.1432 | 27019552 |
+| 0.1468 | 0.3766 | 505 | 1.1389 | 27293064 |
+| 0.1343 | 0.3803 | 510 | 1.1388 | 27559304 |
+| 0.1486 | 0.3841 | 515 | 1.1379 | 27832208 |
+| 0.1227 | 0.3878 | 520 | 1.1369 | 28099304 |
+| 0.185 | 0.3915 | 525 | 1.1392 | 28366024 |
+| 0.1528 | 0.3952 | 530 | 1.1389 | 28634400 |
+| 0.1835 | 0.3990 | 535 | 1.1360 | 28906280 |
+| 0.1858 | 0.4027 | 540 | 1.1376 | 29174248 |
+| 0.1313 | 0.4064 | 545 | 1.1363 | 29446120 |
+| 0.1405 | 0.4102 | 550 | 1.1334 | 29716480 |
+| 0.1816 | 0.4139 | 555 | 1.1334 | 29985760 |
+| 0.2154 | 0.4176 | 560 | 1.1322 | 30252704 |
+| 0.1683 | 0.4213 | 565 | 1.1311 | 30523920 |
+| 0.1828 | 0.4251 | 570 | 1.1330 | 30795368 |
+| 0.1506 | 0.4288 | 575 | 1.1302 | 31062848 |
+| 0.1773 | 0.4325 | 580 | 1.1313 | 31336984 |
+| 0.1544 | 0.4363 | 585 | 1.1319 | 31611648 |
+| 0.1387 | 0.4400 | 590 | 1.1301 | 31880344 |
+| 0.1977 | 0.4437 | 595 | 1.1292 | 32151312 |
+| 0.1209 | 0.4474 | 600 | 1.1328 | 32418584 |
+| 0.1392 | 0.4512 | 605 | 1.1307 | 32693992 |
+| 0.1996 | 0.4549 | 610 | 1.1291 | 32968608 |
+| 0.2297 | 0.4586 | 615 | 1.1300 | 33237576 |
+| 0.1792 | 0.4624 | 620 | 1.1284 | 33504216 |
+| 0.1289 | 0.4661 | 625 | 1.1281 | 33778712 |
+| 0.2102 | 0.4698 | 630 | 1.1286 | 34048008 |
+| 0.1098 | 0.4735 | 635 | 1.1288 | 34318832 |
+| 0.1766 | 0.4773 | 640 | 1.1280 | 34588104 |
+| 0.1247 | 0.4810 | 645 | 1.1277 | 34863712 |
+| 0.1875 | 0.4847 | 650 | 1.1256 | 35137368 |
+| 0.1388 | 0.4885 | 655 | 1.1274 | 35401856 |
+| 0.1543 | 0.4922 | 660 | 1.1260 | 35669288 |
+| 0.1338 | 0.4959 | 665 | 1.1250 | 35938200 |
+| 0.1478 | 0.4997 | 670 | 1.1261 | 36214008 |
+| 0.078 | 0.5034 | 675 | 1.1283 | 36484712 |
+| 0.1088 | 0.5071 | 680 | 1.1274 | 36756992 |
+| 0.1612 | 0.5108 | 685 | 1.1240 | 37024952 |
+| 0.141 | 0.5146 | 690 | 1.1247 | 37288544 |
+| 0.1367 | 0.5183 | 695 | 1.1265 | 37560624 |
+| 0.158 | 0.5220 | 700 | 1.1268 | 37829640 |
+| 0.1697 | 0.5258 | 705 | 1.1262 | 38100088 |
+| 0.1348 | 0.5295 | 710 | 1.1253 | 38367944 |
+| 0.1406 | 0.5332 | 715 | 1.1238 | 38640424 |
+| 0.1578 | 0.5369 | 720 | 1.1266 | 38907976 |
+| 0.1835 | 0.5407 | 725 | 1.1277 | 39174392 |
+| 0.2109 | 0.5444 | 730 | 1.1236 | 39448744 |
+| 0.1624 | 0.5481 | 735 | 1.1219 | 39721744 |
+| 0.1249 | 0.5519 | 740 | 1.1256 | 39995928 |
+| 0.1682 | 0.5556 | 745 | 1.1246 | 40273496 |
+| 0.1751 | 0.5593 | 750 | 1.1226 | 40547520 |
+| 0.1961 | 0.5630 | 755 | 1.1253 | 40821000 |
+| 0.1429 | 0.5668 | 760 | 1.1276 | 41093648 |
+| 0.1388 | 0.5705 | 765 | 1.1218 | 41362480 |
+| 0.1274 | 0.5742 | 770 | 1.1220 | 41633192 |
+| 0.1763 | 0.5780 | 775 | 1.1238 | 41902024 |
+| 0.1543 | 0.5817 | 780 | 1.1229 | 42171960 |
+| 0.1535 | 0.5854 | 785 | 1.1226 | 42445296 |
+| 0.1456 | 0.5891 | 790 | 1.1235 | 42710176 |
+| 0.0793 | 0.5929 | 795 | 1.1218 | 42982376 |
+| 0.2123 | 0.5966 | 800 | 1.1231 | 43246288 |
+| 0.1695 | 0.6003 | 805 | 1.1223 | 43518184 |
+| 0.1431 | 0.6041 | 810 | 1.1233 | 43787688 |
+| 0.1313 | 0.6078 | 815 | 1.1231 | 44058296 |
+| 0.1916 | 0.6115 | 820 | 1.1199 | 44323824 |
+| 0.1367 | 0.6152 | 825 | 1.1183 | 44600736 |
+| 0.1064 | 0.6190 | 830 | 1.1228 | 44871640 |
+| 0.0885 | 0.6227 | 835 | 1.1214 | 45136648 |
+| 0.1405 | 0.6264 | 840 | 1.1183 | 45412936 |
+| 0.1229 | 0.6302 | 845 | 1.1195 | 45676832 |
+| 0.1544 | 0.6339 | 850 | 1.1204 | 45954952 |
+| 0.1298 | 0.6376 | 855 | 1.1215 | 46232064 |
+| 0.207 | 0.6413 | 860 | 1.1232 | 46500536 |
+| 0.1036 | 0.6451 | 865 | 1.1216 | 46768904 |
+| 0.1644 | 0.6488 | 870 | 1.1206 | 47038312 |
+| 0.1903 | 0.6525 | 875 | 1.1191 | 47305784 |
+| 0.1797 | 0.6563 | 880 | 1.1197 | 47567088 |
+| 0.1451 | 0.6600 | 885 | 1.1186 | 47835208 |
+| 0.1295 | 0.6637 | 890 | 1.1170 | 48110200 |
+| 0.0897 | 0.6674 | 895 | 1.1182 | 48387944 |
+| 0.1365 | 0.6712 | 900 | 1.1182 | 48654496 |
+| 0.1166 | 0.6749 | 905 | 1.1168 | 48925776 |
+| 0.1172 | 0.6786 | 910 | 1.1220 | 49198040 |
+| 0.1452 | 0.6824 | 915 | 1.1210 | 49470608 |
+| 0.1495 | 0.6861 | 920 | 1.1190 | 49741056 |
+| 0.113 | 0.6898 | 925 | 1.1189 | 50014608 |
+| 0.1343 | 0.6935 | 930 | 1.1204 | 50288136 |
+| 0.1857 | 0.6973 | 935 | 1.1175 | 50558880 |
+| 0.1177 | 0.7010 | 940 | 1.1170 | 50828624 |
+| 0.169 | 0.7047 | 945 | 1.1168 | 51102088 |
+| 0.2074 | 0.7085 | 950 | 1.1151 | 51369824 |
+| 0.1161 | 0.7122 | 955 | 1.1168 | 51641024 |
+| 0.1411 | 0.7159 | 960 | 1.1170 | 51909240 |
+| 0.1514 | 0.7196 | 965 | 1.1158 | 52177496 |
+| 0.1911 | 0.7234 | 970 | 1.1176 | 52450824 |
+| 0.163 | 0.7271 | 975 | 1.1162 | 52729824 |
+| 0.0962 | 0.7308 | 980 | 1.1152 | 52995240 |
+| 0.1413 | 0.7346 | 985 | 1.1180 | 53263416 |
+| 0.2341 | 0.7383 | 990 | 1.1176 | 53527672 |
+| 0.109 | 0.7420 | 995 | 1.1147 | 53793040 |
+| 0.1362 | 0.7457 | 1000 | 1.1141 | 54066168 |
+| 0.1523 | 0.7495 | 1005 | 1.1145 | 54337184 |
+| 0.1541 | 0.7532 | 1010 | 1.1154 | 54613256 |
+| 0.1942 | 0.7569 | 1015 | 1.1168 | 54884736 |
+| 0.1567 | 0.7607 | 1020 | 1.1169 | 55156512 |
+| 0.1341 | 0.7644 | 1025 | 1.1186 | 55429832 |
+| 0.0783 | 0.7681 | 1030 | 1.1167 | 55703192 |
+| 0.1526 | 0.7718 | 1035 | 1.1157 | 55978080 |
+| 0.201 | 0.7756 | 1040 | 1.1135 | 56246208 |
+| 0.1721 | 0.7793 | 1045 | 1.1119 | 56518304 |
+| 0.1958 | 0.7830 | 1050 | 1.1158 | 56786584 |
+| 0.1789 | 0.7868 | 1055 | 1.1182 | 57058752 |
+| 0.1706 | 0.7905 | 1060 | 1.1138 | 57340616 |
+| 0.1119 | 0.7942 | 1065 | 1.1121 | 57618032 |
+| 0.1033 | 0.7979 | 1070 | 1.1150 | 57890944 |
+| 0.0648 | 0.8017 | 1075 | 1.1166 | 58157784 |
+| 0.1655 | 0.8054 | 1080 | 1.1131 | 58428976 |
+| 0.1665 | 0.8091 | 1085 | 1.1122 | 58700640 |
+| 0.245 | 0.8129 | 1090 | 1.1139 | 58964952 |
+| 0.0995 | 0.8166 | 1095 | 1.1137 | 59233864 |
+| 0.1095 | 0.8203 | 1100 | 1.1134 | 59502808 |
+| 0.1329 | 0.8241 | 1105 | 1.1133 | 59775576 |
+| 0.2066 | 0.8278 | 1110 | 1.1127 | 60051688 |
+| 0.0901 | 0.8315 | 1115 | 1.1136 | 60315488 |
+| 0.1157 | 0.8352 | 1120 | 1.1137 | 60590808 |
+| 0.178 | 0.8390 | 1125 | 1.1135 | 60866712 |
+| 0.1368 | 0.8427 | 1130 | 1.1139 | 61137936 |
+| 0.1683 | 0.8464 | 1135 | 1.1148 | 61405640 |
+| 0.193 | 0.8502 | 1140 | 1.1094 | 61679192 |
+| 0.0919 | 0.8539 | 1145 | 1.1099 | 61950280 |
+| 0.1054 | 0.8576 | 1150 | 1.1116 | 62221520 |
+| 0.1405 | 0.8613 | 1155 | 1.1089 | 62492616 |
+| 0.2065 | 0.8651 | 1160 | 1.1088 | 62768672 |
+| 0.0888 | 0.8688 | 1165 | 1.1109 | 63034280 |
+| 0.107 | 0.8725 | 1170 | 1.1133 | 63305184 |
+| 0.1131 | 0.8763 | 1175 | 1.1138 | 63570736 |
+| 0.154 | 0.8800 | 1180 | 1.1125 | 63839160 |
+| 0.2166 | 0.8837 | 1185 | 1.1128 | 64107448 |
+| 0.17 | 0.8874 | 1190 | 1.1112 | 64384992 |
+| 0.097 | 0.8912 | 1195 | 1.1101 | 64654992 |
+| 0.1523 | 0.8949 | 1200 | 1.1113 | 64923728 |
+| 0.1752 | 0.8986 | 1205 | 1.1110 | 65194568 |
+| 0.1477 | 0.9024 | 1210 | 1.1107 | 65468632 |
+| 0.124 | 0.9061 | 1215 | 1.1104 | 65732912 |
+| 0.1321 | 0.9098 | 1220 | 1.1107 | 65998584 |
+| 0.1027 | 0.9135 | 1225 | 1.1109 | 66275336 |
+| 0.1562 | 0.9173 | 1230 | 1.1131 | 66549384 |
+| 0.1955 | 0.9210 | 1235 | 1.1105 | 66817160 |
+| 0.1341 | 0.9247 | 1240 | 1.1086 | 67076696 |
+| 0.1253 | 0.9285 | 1245 | 1.1099 | 67347784 |
+| 0.2128 | 0.9322 | 1250 | 1.1119 | 67614184 |
+| 0.1334 | 0.9359 | 1255 | 1.1100 | 67883936 |
+| 0.1227 | 0.9396 | 1260 | 1.1085 | 68159848 |
+| 0.1073 | 0.9434 | 1265 | 1.1110 | 68434512 |
+| 0.126 | 0.9471 | 1270 | 1.1105 | 68701016 |
+| 0.1085 | 0.9508 | 1275 | 1.1112 | 68971216 |
+| 0.1942 | 0.9546 | 1280 | 1.1079 | 69243880 |
+| 0.1107 | 0.9583 | 1285 | 1.1082 | 69513872 |
+| 0.1296 | 0.9620 | 1290 | 1.1091 | 69781928 |
+| 0.1981 | 0.9657 | 1295 | 1.1087 | 70046808 |
+| 0.2142 | 0.9695 | 1300 | 1.1073 | 70319584 |
+| 0.145 | 0.9732 | 1305 | 1.1094 | 70592160 |
+| 0.2102 | 0.9769 | 1310 | 1.1095 | 70861072 |
+| 0.1017 | 0.9807 | 1315 | 1.1088 | 71127088 |
+| 0.1419 | 0.9844 | 1320 | 1.1090 | 71394680 |
+| 0.1959 | 0.9881 | 1325 | 1.1061 | 71667208 |
+| 0.205 | 0.9918 | 1330 | 1.1043 | 71935800 |
+| 0.1699 | 0.9956 | 1335 | 1.1050 | 72203200 |
+| 0.1449 | 0.9993 | 1340 | 1.1072 | 72483520 |


 ### Framework versions
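The results table logs validation loss every 5 steps, so it can be scanned for the best evaluation point. The following is a minimal sketch, not part of the commit: `EvalRow` and `parse_rows` are hypothetical helper names, and `EXCERPT` holds only a handful of rows copied from the table rather than the full log.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    tokens_seen: int


def parse_rows(table: str) -> list[EvalRow]:
    """Parse the data rows of a markdown results table (hypothetical helper)."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header and alignment rows: only data rows have a numeric step.
        if len(cells) != 5 or not cells[2].isdigit():
            continue
        rows.append(EvalRow(int(cells[2]), float(cells[3]), int(cells[4])))
    return rows


# A few rows copied from the table above (an excerpt, not the full log).
EXCERPT = """
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log | 0 | 0 | 1.3909 | 0 |
| 1.2543 | 0.0261 | 35 | 1.1938 | 1883584 |
| 0.54 | 0.0522 | 70 | 1.2746 | 3766984 |
| 0.205 | 0.9918 | 1330 | 1.1043 | 71935800 |
| 0.1449 | 0.9993 | 1340 | 1.1072 | 72483520 |
"""

rows = parse_rows(EXCERPT)
best = min(rows, key=lambda r: r.validation_loss)
final = rows[-1]
print(f"lowest validation loss in excerpt: {best.validation_loss} at step {best.step}")
print(f"average input tokens per step: {final.tokens_seen / final.step:.0f}")
```

Within this excerpt the lowest validation loss falls at step 1330, slightly before the final step whose loss (1.1072) is the value reported in the README summary.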
model-00001-of-00002.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
 size 4988025760

 version https://git-lfs.github.com/spec/v1
+oid sha256:8e101cc03635f475c21581a1219f37e28da5bafce8905f1a639f9275b4c490e1
 size 4988025760
model-00002-of-00002.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
 size 240691728

 version https://git-lfs.github.com/spec/v1
+oid sha256:996e8f9529a0ee25fe8c974f16ca59962105eb133ee2116f96bd94564925612c
 size 240691728
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
 size 5624

 version https://git-lfs.github.com/spec/v1
+oid sha256:46a6d40faeff91c382a19a33987dc6797c2e024675393be5b52506154c13687d
 size 5624
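Each changed binary above is stored as a Git LFS pointer: three `key value` lines giving the spec version, a SHA-256 `oid`, and the byte `size`. As a rough sketch of how a downloaded file could be checked against such a pointer, the snippet below recomputes both values. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the self-check uses a small stand-in payload rather than the actual model shards.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: str, pointer_text: str) -> bool:
    """Return True if the file has the size and SHA-256 recorded in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected_hex = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    # Cheap size check first, so a wrong-size file is rejected without hashing.
    if os.path.getsize(path) != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so multi-gigabyte shards are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex


# Self-check with a stand-in payload (an assumption, not a real shard).
payload = b"stand-in for a weights shard"
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
    f"size {len(payload)}\n"
)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
ok = matches_pointer(tmp.name, pointer)
os.remove(tmp.name)
print(ok)
```

To check a real download, the pointer text would be the three lines shown for that file in the diff and `path` would point at the local copy.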