---
library_name: transformers
license: llama3.2
base_model: meta-llama/Llama-3.2-1B
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: tmp
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# tmp

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset.
It achieves the following results on the evaluation set (the value matches the last logged evaluation, step 11200, in the training results table):
- Loss: 1.3192
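
For readers who prefer perplexity, the loss above can be converted with a one-line calculation. This is a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card). The function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Final evaluation loss reported in this card.
final_eval_loss = 1.3192
print(f"perplexity ~ {perplexity(final_eval_loss):.2f}")
```

Perplexity is only comparable between models that share a tokenizer and an evaluation set.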

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1.41e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
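
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given step under a linear decay to zero. It rests on assumptions not recorded in this card: zero warmup steps, and a total step count estimated from the last row of the training results table (step 11200 at epoch 2.9971). The helper `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1.41e-05,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Assumption: estimate total steps from the last logged row of the table.
steps_per_epoch = 11200 / 2.9971
total_steps = round(steps_per_epoch * 3.0)

for step in (0, total_steps // 2, 11200):
    print(step, linear_lr(step, total_steps))
```

If the run actually used warmup, pass the real value as `warmup_steps`. The decay shape after warmup stays the same.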

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.5915        | 0.0134 | 50    | 1.2903          |
| 1.3717        | 0.0268 | 100   | 1.2596          |
| 1.0418        | 0.0401 | 150   | 1.2804          |
| 1.2548        | 0.0535 | 200   | 1.2606          |
| 1.3994        | 0.0669 | 250   | 1.2454          |
| 1.1584        | 0.0803 | 300   | 1.2351          |
| 1.0075        | 0.0937 | 350   | 1.2278          |
| 1.3926        | 0.1070 | 400   | 1.2254          |
| 1.2131        | 0.1204 | 450   | 1.2212          |
| 1.3407        | 0.1338 | 500   | 1.2100          |
| 1.042         | 0.1472 | 550   | 1.2228          |
| 1.398         | 0.1606 | 600   | 1.2049          |
| 0.9886        | 0.1739 | 650   | 1.2033          |
| 1.3415        | 0.1873 | 700   | 1.1979          |
| 1.414         | 0.2007 | 750   | 1.1973          |
| 1.0634        | 0.2141 | 800   | 1.1925          |
| 0.9591        | 0.2275 | 850   | 1.1846          |
| 1.4814        | 0.2408 | 900   | 1.1816          |
| 1.4658        | 0.2542 | 950   | 1.1804          |
| 1.3086        | 0.2676 | 1000  | 1.1789          |
| 0.9067        | 0.2810 | 1050  | 1.1678          |
| 1.0266        | 0.2944 | 1100  | 1.1679          |
| 1.6225        | 0.3077 | 1150  | 1.1694          |
| 1.206         | 0.3211 | 1200  | 1.1668          |
| 1.2348        | 0.3345 | 1250  | 1.1628          |
| 1.3967        | 0.3479 | 1300  | 1.1572          |
| 1.3526        | 0.3613 | 1350  | 1.1558          |
| 1.0515        | 0.3746 | 1400  | 1.1578          |
| 1.2215        | 0.3880 | 1450  | 1.1574          |
| 0.8743        | 0.4014 | 1500  | 1.1516          |
| 1.5303        | 0.4148 | 1550  | 1.1461          |
| 1.1828        | 0.4282 | 1600  | 1.1523          |
| 0.9266        | 0.4415 | 1650  | 1.1443          |
| 1.3904        | 0.4549 | 1700  | 1.1358          |
| 1.138         | 0.4683 | 1750  | 1.1440          |
| 1.3723        | 0.4817 | 1800  | 1.1389          |
| 1.2073        | 0.4950 | 1850  | 1.1416          |
| 1.1665        | 0.5084 | 1900  | 1.1335          |
| 1.2742        | 0.5218 | 1950  | 1.1289          |
| 1.1677        | 0.5352 | 2000  | 1.1286          |
| 1.0681        | 0.5486 | 2050  | 1.1289          |
| 0.8086        | 0.5619 | 2100  | 1.1200          |
| 0.79          | 0.5753 | 2150  | 1.1245          |
| 0.9748        | 0.5887 | 2200  | 1.1275          |
| 1.2156        | 0.6021 | 2250  | 1.1204          |
| 0.8723        | 0.6155 | 2300  | 1.1151          |
| 0.9383        | 0.6288 | 2350  | 1.1160          |
| 1.0047        | 0.6422 | 2400  | 1.1169          |
| 0.9831        | 0.6556 | 2450  | 1.1192          |
| 0.7517        | 0.6690 | 2500  | 1.1098          |
| 1.3771        | 0.6824 | 2550  | 1.1128          |
| 1.0822        | 0.6957 | 2600  | 1.1158          |
| 1.0965        | 0.7091 | 2650  | 1.1073          |
| 1.0562        | 0.7225 | 2700  | 1.1108          |
| 1.0419        | 0.7359 | 2750  | 1.1184          |
| 0.8352        | 0.7493 | 2800  | 1.1060          |
| 1.0286        | 0.7626 | 2850  | 1.1043          |
| 0.9745        | 0.7760 | 2900  | 1.1019          |
| 0.9868        | 0.7894 | 2950  | 1.0965          |
| 1.0109        | 0.8028 | 3000  | 1.0978          |
| 1.437         | 0.8162 | 3050  | 1.0969          |
| 0.8           | 0.8295 | 3100  | 1.0882          |
| 1.1526        | 0.8429 | 3150  | 1.0912          |
| 1.052         | 0.8563 | 3200  | 1.0922          |
| 1.1689        | 0.8697 | 3250  | 1.0871          |
| 1.3413        | 0.8831 | 3300  | 1.0851          |
| 1.1188        | 0.8964 | 3350  | 1.0833          |
| 1.625         | 0.9098 | 3400  | 1.0867          |
| 1.3762        | 0.9232 | 3450  | 1.0816          |
| 1.0802        | 0.9366 | 3500  | 1.0825          |
| 0.9063        | 0.9500 | 3550  | 1.0767          |
| 1.0199        | 0.9633 | 3600  | 1.0783          |
| 1.5628        | 0.9767 | 3650  | 1.0750          |
| 1.0558        | 0.9901 | 3700  | 1.0774          |
| 0.7092        | 1.0035 | 3750  | 1.0841          |
| 0.7194        | 1.0169 | 3800  | 1.1159          |
| 0.8033        | 1.0302 | 3850  | 1.1189          |
| 0.5744        | 1.0436 | 3900  | 1.1321          |
| 0.6601        | 1.0570 | 3950  | 1.1199          |
| 0.8371        | 1.0704 | 4000  | 1.1241          |
| 0.8107        | 1.0838 | 4050  | 1.1225          |
| 0.6045        | 1.0971 | 4100  | 1.1291          |
| 0.6476        | 1.1105 | 4150  | 1.1280          |
| 0.6125        | 1.1239 | 4200  | 1.1228          |
| 0.5005        | 1.1373 | 4250  | 1.1239          |
| 0.7029        | 1.1507 | 4300  | 1.1302          |
| 0.7131        | 1.1640 | 4350  | 1.1217          |
| 0.7028        | 1.1774 | 4400  | 1.1266          |
| 0.7679        | 1.1908 | 4450  | 1.1164          |
| 0.7504        | 1.2042 | 4500  | 1.1235          |
| 0.7788        | 1.2176 | 4550  | 1.1253          |
| 0.6972        | 1.2309 | 4600  | 1.1166          |
| 1.0489        | 1.2443 | 4650  | 1.1204          |
| 0.4751        | 1.2577 | 4700  | 1.1185          |
| 0.5464        | 1.2711 | 4750  | 1.1254          |
| 0.7255        | 1.2845 | 4800  | 1.1202          |
| 0.8914        | 1.2978 | 4850  | 1.1193          |
| 0.5107        | 1.3112 | 4900  | 1.1252          |
| 0.8114        | 1.3246 | 4950  | 1.1243          |
| 0.6298        | 1.3380 | 5000  | 1.1261          |
| 0.9236        | 1.3514 | 5050  | 1.1245          |
| 0.7085        | 1.3647 | 5100  | 1.1213          |
| 0.7505        | 1.3781 | 5150  | 1.1127          |
| 0.7309        | 1.3915 | 5200  | 1.1178          |
| 0.5225        | 1.4049 | 5250  | 1.1216          |
| 0.8705        | 1.4182 | 5300  | 1.1134          |
| 0.5532        | 1.4316 | 5350  | 1.1193          |
| 0.4079        | 1.4450 | 5400  | 1.1142          |
| 0.5628        | 1.4584 | 5450  | 1.1138          |
| 0.716         | 1.4718 | 5500  | 1.1126          |
| 0.382         | 1.4851 | 5550  | 1.1150          |
| 0.6474        | 1.4985 | 5600  | 1.1143          |
| 0.6119        | 1.5119 | 5650  | 1.1112          |
| 0.4815        | 1.5253 | 5700  | 1.1047          |
| 0.8477        | 1.5387 | 5750  | 1.1158          |
| 0.8981        | 1.5520 | 5800  | 1.1108          |
| 0.639         | 1.5654 | 5850  | 1.1141          |
| 0.727         | 1.5788 | 5900  | 1.1137          |
| 0.8175        | 1.5922 | 5950  | 1.1116          |
| 0.7431        | 1.6056 | 6000  | 1.1152          |
| 0.6324        | 1.6189 | 6050  | 1.1145          |
| 1.0941        | 1.6323 | 6100  | 1.1142          |
| 0.6437        | 1.6457 | 6150  | 1.1082          |
| 0.5857        | 1.6591 | 6200  | 1.1103          |
| 0.4056        | 1.6725 | 6250  | 1.1137          |
| 0.6483        | 1.6858 | 6300  | 1.1069          |
| 0.6741        | 1.6992 | 6350  | 1.1027          |
| 0.7587        | 1.7126 | 6400  | 1.1087          |
| 0.7206        | 1.7260 | 6450  | 1.1156          |
| 0.451         | 1.7394 | 6500  | 1.1074          |
| 0.8237        | 1.7527 | 6550  | 1.1055          |
| 0.6333        | 1.7661 | 6600  | 1.1078          |
| 0.6317        | 1.7795 | 6650  | 1.1049          |
| 0.6688        | 1.7929 | 6700  | 1.1011          |
| 0.6598        | 1.8063 | 6750  | 1.1030          |
| 0.642         | 1.8196 | 6800  | 1.1059          |
| 0.587         | 1.8330 | 6850  | 1.1002          |
| 0.7726        | 1.8464 | 6900  | 1.0966          |
| 0.8227        | 1.8598 | 6950  | 1.1014          |
| 0.9093        | 1.8732 | 7000  | 1.1011          |
| 0.6117        | 1.8865 | 7050  | 1.0999          |
| 0.8338        | 1.8999 | 7100  | 1.0937          |
| 0.7215        | 1.9133 | 7150  | 1.0935          |
| 0.6242        | 1.9267 | 7200  | 1.0909          |
| 0.571         | 1.9401 | 7250  | 1.0990          |
| 0.7773        | 1.9534 | 7300  | 1.0955          |
| 0.7082        | 1.9668 | 7350  | 1.0955          |
| 0.7165        | 1.9802 | 7400  | 1.0982          |
| 0.5604        | 1.9936 | 7450  | 1.0985          |
| 0.3232        | 2.0070 | 7500  | 1.1841          |
| 0.3628        | 2.0203 | 7550  | 1.2569          |
| 0.4465        | 2.0337 | 7600  | 1.2687          |
| 0.3233        | 2.0471 | 7650  | 1.2720          |
| 0.281         | 2.0605 | 7700  | 1.2859          |
| 0.2199        | 2.0739 | 7750  | 1.2808          |
| 0.4787        | 2.0872 | 7800  | 1.2839          |
| 0.4288        | 2.1006 | 7850  | 1.2918          |
| 0.2966        | 2.1140 | 7900  | 1.3063          |
| 0.4248        | 2.1274 | 7950  | 1.3061          |
| 0.2717        | 2.1408 | 8000  | 1.2926          |
| 0.3561        | 2.1541 | 8050  | 1.3054          |
| 0.3736        | 2.1675 | 8100  | 1.2947          |
| 0.2936        | 2.1809 | 8150  | 1.3021          |
| 0.3316        | 2.1943 | 8200  | 1.2981          |
| 0.2931        | 2.2077 | 8250  | 1.3007          |
| 0.4591        | 2.2210 | 8300  | 1.2972          |
| 0.3023        | 2.2344 | 8350  | 1.3127          |
| 0.3407        | 2.2478 | 8400  | 1.3110          |
| 0.2361        | 2.2612 | 8450  | 1.3071          |
| 0.3509        | 2.2746 | 8500  | 1.3021          |
| 0.3868        | 2.2879 | 8550  | 1.3168          |
| 0.3218        | 2.3013 | 8600  | 1.3156          |
| 0.2913        | 2.3147 | 8650  | 1.3034          |
| 0.437         | 2.3281 | 8700  | 1.3214          |
| 0.4314        | 2.3415 | 8750  | 1.3136          |
| 0.3151        | 2.3548 | 8800  | 1.3085          |
| 0.3236        | 2.3682 | 8850  | 1.3100          |
| 0.3416        | 2.3816 | 8900  | 1.3050          |
| 0.3333        | 2.3950 | 8950  | 1.3151          |
| 0.2742        | 2.4083 | 9000  | 1.3153          |
| 0.3143        | 2.4217 | 9050  | 1.3243          |
| 0.4152        | 2.4351 | 9100  | 1.3164          |
| 0.219         | 2.4485 | 9150  | 1.3233          |
| 0.4057        | 2.4619 | 9200  | 1.3073          |
| 0.3571        | 2.4752 | 9250  | 1.3084          |
| 0.3163        | 2.4886 | 9300  | 1.3184          |
| 0.3185        | 2.5020 | 9350  | 1.3092          |
| 0.4474        | 2.5154 | 9400  | 1.3185          |
| 0.1927        | 2.5288 | 9450  | 1.3158          |
| 0.2362        | 2.5421 | 9500  | 1.3093          |
| 0.3651        | 2.5555 | 9550  | 1.3116          |
| 0.2531        | 2.5689 | 9600  | 1.3121          |
| 0.2219        | 2.5823 | 9650  | 1.3192          |
| 0.2546        | 2.5957 | 9700  | 1.3170          |
| 0.2841        | 2.6090 | 9750  | 1.3180          |
| 0.3039        | 2.6224 | 9800  | 1.3188          |
| 0.3866        | 2.6358 | 9850  | 1.3253          |
| 0.378         | 2.6492 | 9900  | 1.3143          |
| 0.2671        | 2.6626 | 9950  | 1.3143          |
| 0.2715        | 2.6759 | 10000 | 1.3220          |
| 0.2104        | 2.6893 | 10050 | 1.3275          |
| 0.2663        | 2.7027 | 10100 | 1.3186          |
| 0.3433        | 2.7161 | 10150 | 1.3201          |
| 0.3493        | 2.7295 | 10200 | 1.3169          |
| 0.3615        | 2.7428 | 10250 | 1.3184          |
| 0.2843        | 2.7562 | 10300 | 1.3196          |
| 0.263         | 2.7696 | 10350 | 1.3158          |
| 0.2971        | 2.7830 | 10400 | 1.3136          |
| 0.2198        | 2.7964 | 10450 | 1.3231          |
| 0.1814        | 2.8097 | 10500 | 1.3187          |
| 0.303         | 2.8231 | 10550 | 1.3175          |
| 0.4044        | 2.8365 | 10600 | 1.3171          |
| 0.2374        | 2.8499 | 10650 | 1.3212          |
| 0.2155        | 2.8633 | 10700 | 1.3229          |
| 0.2656        | 2.8766 | 10750 | 1.3251          |
| 0.2552        | 2.8900 | 10800 | 1.3184          |
| 0.2838        | 2.9034 | 10850 | 1.3198          |
| 0.2824        | 2.9168 | 10900 | 1.3192          |
| 0.2748        | 2.9302 | 10950 | 1.3172          |
| 0.2951        | 2.9435 | 11000 | 1.3193          |
| 0.3339        | 2.9569 | 11050 | 1.3196          |
| 0.3167        | 2.9703 | 11100 | 1.3195          |
| 0.2751        | 2.9837 | 11150 | 1.3192          |
| 0.3687        | 2.9971 | 11200 | 1.3192          |
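
In the table above, validation loss reaches its lowest value (1.0750) at step 3650, near the end of the first epoch. It then rises while training loss keeps falling, a pattern that may indicate overfitting in the later epochs. The sketch below shows one way to locate the best checkpoint from rows in this format. It embeds only a handful of rows copied from the table, and the helper names `parse_rows` and `best_checkpoint` are illustrative.

```python
# A few rows copied verbatim from the training results table.
SAMPLE_ROWS = """\
| 1.5915        | 0.0134 | 50    | 1.2903          |
| 1.5628        | 0.9767 | 3650  | 1.0750          |
| 0.7092        | 1.0035 | 3750  | 1.0841          |
| 0.5604        | 1.9936 | 7450  | 1.0985          |
| 0.3232        | 2.0070 | 7500  | 1.1841          |
| 0.3687        | 2.9971 | 11200 | 1.3192          |
"""


def parse_rows(markdown: str) -> list[dict]:
    """Parse Markdown table body rows into dictionaries of numbers."""
    rows = []
    for line in markdown.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


def best_checkpoint(rows: list[dict]) -> dict:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


best = best_checkpoint(parse_rows(SAMPLE_ROWS))
print(f"best step: {best['step']} (validation loss {best['val_loss']})")
```

Pasting the full table body into `SAMPLE_ROWS` applies the same selection to every logged evaluation.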


### Framework versions

- Transformers 4.45.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.1
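
To check whether a local environment matches the versions listed above, a small standard-library script is enough. This is a sketch that assumes the packages are installed under their usual distribution names (`transformers`, `torch`, `datasets`, `tokenizers`). The helper `compare_versions` is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed distribution name.
CARD_VERSIONS = {
    "transformers": "4.45.2",
    "torch": "2.4.1+cu121",
    "datasets": "3.0.1",
    "tokenizers": "0.20.1",
}


def compare_versions(expected: dict[str, str]) -> dict[str, tuple]:
    """Map each package to (expected, installed or None, exact match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # Package is not installed in this environment.
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions(CARD_VERSIONS).items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily prevent loading the model. The listed versions only record the environment used for training.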