End of training
- README.md +8 -8
- adapter_model.bin +1 -1
README.md
CHANGED
@@ -24,18 +24,18 @@ base_model_config: unsloth/Llama-3.2-3B-Instruct
 bf16: auto
 chat_template: llama3
 dataset_prepared_path: null
-dataset_type: instruct
 datasets:
--
+- data_files:
+  - ds_example.json
+  ds_type: json
   path: data/ds_example.json
-  type:
+  type: instruct
 debug: null
 deepspeed: null
 early_stopping_patience: null
 eval_max_new_tokens: 128
 eval_table_size: null
 evals_per_epoch: 4
-file_format: json
 flash_attention: true
 fp16: null
 fsdp: null
@@ -59,7 +59,7 @@ lora_model_dir: null
 lora_r: 32
 lora_target_linear: true
 lr_scheduler: cosine
-max_steps:
+max_steps: 5
 micro_batch_size: 2
 mlflow_experiment_name: miner_id_24
 mlflow_tracking_uri: http://94.156.8.49:5000
@@ -94,7 +94,7 @@ xformers_attention: null
 
 This model is a fine-tuned version of [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 11.2689
 
 ## Model description
 
@@ -128,8 +128,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-|
-|
+| 13.3042 | 0.8 | 1 | 11.5243 |
+| 13.1934 | 1.6 | 2 | 11.2689 |
 
 
 ### Framework versions
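Applied together, the `+` lines above leave the `datasets:` block of the axolotl config reading as follows (reconstructed from this diff for readability; the surrounding keys are unchanged):

```yaml
datasets:
- data_files:
  - ds_example.json
  ds_type: json
  path: data/ds_example.json
  type: instruct
```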
adapter_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:bb4842d4922c5f75424db453ffc16bec1e112d1aed9337c4f5a87df1c18fab6e
 size 982663982
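The adapter_model.bin entry above is a Git LFS pointer file: the repository stores only a version line, a sha256 oid, and a byte size, while the actual adapter weights live in LFS storage. A minimal sketch of reading such a pointer (`parse_lfs_pointer` is a hypothetical helper name; the pointer text is copied from this commit):

```python
# A Git LFS pointer is three "key value" lines; split each on the
# first space, then split the oid into algorithm and hex digest.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:bb4842d4922c5f75424db453ffc16bec1e112d1aed9337c4f5a87df1c18fab6e
size 982663982
"""

def parse_lfs_pointer(text: str) -> dict:
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

ptr = parse_lfs_pointer(POINTER)
print(ptr["algo"], ptr["size"])  # sha256 982663982
```

Comparing `ptr["digest"]` against the sha256 of the downloaded file is one way to confirm the binary matches this commit.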