1289.4068 seconds used for training.
21.49 minutes used for training.
Peak reserved memory = 9.545 GB.
Peak reserved memory for training = 4.018 GB.
Peak reserved memory % of max memory = 43.058 %.
Peak reserved memory for training % of max memory = 18.125 %.
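The memory summary above can be reproduced with plain arithmetic. A minimal sketch follows; note that the total GPU memory of ~22.168 GB is an assumption inferred from the logged percentages (it is not stated anywhere in the log), and `start_gpu_memory` (memory reserved before training began) is likewise derived rather than logged:

```python
# Reproduce the peak-memory summary printed above.
# max_memory and start_gpu_memory are inferred assumptions, not logged values.
max_memory = 22.168                      # GB total (inferred: 9.545 / 0.43058)
used_memory = 9.545                      # GB peak reserved after training
start_gpu_memory = used_memory - 4.018   # GB reserved before training started

used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory / max_memory * 100, 3)
lora_percentage = round(used_memory_for_lora / max_memory * 100, 3)

print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")
```

In practice these figures would come from `torch.cuda.max_memory_reserved()` and the device's `total_memory`, divided down to gigabytes.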
args = TrainingArguments(
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,
    warmup_steps = 10,      # increased the number of warmup steps
    max_steps = 200,        # increased the total number of steps
    learning_rate = 1e-4,   # reduced the learning rate
    fp16 = not torch.cuda.is_bf16_supported(),
    bf16 = torch.cuda.is_bf16_supported(),
    logging_steps = 1,
    optim = "adamw_8bit",
    weight_decay = 0.01,
    lr_scheduler_type = "linear",
    seed = 42,
    output_dir = "outputs",
)
==((====))== Unsloth - 2x faster free finetuning | Num GPUs = 1
\\ /| Num examples = 399 | Num Epochs = 4
O^O/ \_/ \ Batch size per device = 2 | Gradient Accumulation steps = 4
\ / Total batch size = 8 | Total steps = 200
"-____-" Number of trainable parameters = 20,971,520
[200/200 21:17, Epoch 4/4]
Step Training Loss
1 2.027900
2 2.008700
3 1.946100
4 1.924700
5 1.995000
6 1.999000
7 1.870100
8 1.891400
9 1.807600
10 1.723200
11 1.665100
12 1.541000
13 1.509100
14 1.416600
15 1.398600
16 1.233200
17 1.172100
18 1.272100
19 1.146000
20 1.179000
21 1.206400
22 1.095400
23 0.937300
24 1.214300
25 1.040200
26 1.183400
27 1.033900
28 0.953100
29 0.935700
30 0.962200
31 0.908900
32 0.924900
33 0.931000
34 1.011300
35 0.951900
36 0.936000
37 0.903000
38 0.906900
39 0.945700
40 0.827000
41 0.931800
42 0.919600
43 0.926900
44 0.932900
45 0.872700
46 0.795200
47 0.888700
48 0.956800
49 1.004200
50 0.859500
51 0.802500
52 0.855400
53 0.885500
54 1.026600
55 0.844100
56 0.879800
57 0.797400
58 0.885300
59 0.842800
60 0.861600
61 0.789100
62 0.861600
63 0.856700
64 0.929200
65 0.782500
66 0.713600
67 0.781000
68 0.765100
69 0.784700
70 0.869500
71 0.742900
72 0.787900
73 0.750800
74 0.931700
75 0.713000
76 0.832100
77 0.928300
78 0.777600
79 0.694000
80 0.835400
81 0.822000
82 0.754600
83 0.813400
84 0.868800
85 0.732400
86 0.803700
87 0.694400
88 0.771300
89 0.864400
90 0.646700
91 0.690800
92 0.695000
93 0.732300
94 0.766900
95 0.864100
96 0.867200
97 0.774300
98 0.797700
99 0.772100
100 0.906700
101 0.693400
102 0.685500
103 0.712200
104 0.678400
105 0.761900
106 0.705300
107 0.775700
108 0.627600
109 0.599300
110 0.615100
111 0.618200
112 0.668700
113 0.699900
114 0.577000
115 0.711600
116 0.692900
117 0.585400
118 0.646400
119 0.569200
120 0.752300
121 0.745000
122 0.690100
123 0.744700
124 0.665800
125 0.866100
126 0.707400
127 0.679300
128 0.591400
129 0.655100
130 0.734000
131 0.637900
132 0.733900
133 0.652500
134 0.685400
135 0.641300
136 0.608200
137 0.754100
138 0.753700
139 0.671000
140 0.767200
141 0.668700
142 0.630300
143 0.734700
144 0.767700
145 0.722200
146 0.694400
147 0.710100
148 0.696300
149 0.612600
150 0.670400
151 0.512900
152 0.675100
153 0.579900
154 0.622900
155 0.652500
156 0.649200
157 0.546700
158 0.521600
159 0.522200
160 0.589400
161 0.552600
162 0.630700
163 0.595600
164 0.614300
165 0.489400
166 0.634500
167 0.620800
168 0.618600
169 0.637900
170 0.553900
171 0.656000
172 0.644000
173 0.694300
174 0.608900
175 0.673000
176 0.612500
177 0.654200
178 0.639200
179 0.599100
180 0.642100
181 0.529700
182 0.614000
183 0.582900
184 0.765100
185 0.502700
186 0.564300
187 0.740200
188 0.636100
189 0.638800
190 0.560100
191 0.620000
192 0.712800
193 0.531000
194 0.591600
195 0.608600
196 0.671800
197 0.572900
198 0.600900
199 0.586800
200 0.545900
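Step-level loss is noisy, so the trend is easier to see by averaging windows of the log. A small sketch, using the first and last ten loss values copied from the table above:

```python
# Average the first and last ten training-loss values from the log above
# to show the overall trend without step-to-step noise.
first_10 = [2.0279, 2.0087, 1.9461, 1.9247, 1.9950,
            1.9990, 1.8701, 1.8914, 1.8076, 1.7232]
last_10 = [0.6200, 0.7128, 0.5310, 0.5916, 0.6086,
           0.6718, 0.5729, 0.6009, 0.5868, 0.5459]

mean_start = round(sum(first_10) / len(first_10), 3)
mean_end = round(sum(last_10) / len(last_10), 3)

print(mean_start, mean_end)  # loss drops from ~1.92 to ~0.60 over 200 steps
```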
---
base_model: unsloth/llama-3-8b-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- gguf
---
# Uploaded model
- **Developed by:** Mathoufle13
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)