End of training

Files changed:
- README.md (+7 -6)
- adapter_model.bin (+2 -2)

README.md CHANGED
@@ -71,16 +71,17 @@ pad_to_sequence_len: true
 resume_from_checkpoint: null
 s2_attention: null
 sample_packing: false
-save_steps:
+save_steps: 5
 save_strategy: steps
 sequence_len: 4096
 special_tokens:
-  pad_token:
+  pad_token: ' '
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
 train_on_inputs: false
 val_set_size: 0.05
+wandb_mode: disabled
 warmup_steps: 10
 weight_decay: 0.0
 xformers_attention: null
@@ -93,7 +94,7 @@ xformers_attention: null
 
 This model is a fine-tuned version of [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 5.0689
 
 ## Model description
 
@@ -127,14 +128,14 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-
-
+| 5.4988        | 0.8   | 1    | 5.0751          |
+| 5.2725        | 1.6   | 2    | 5.0689          |
 
 
 ### Framework versions
 
 - PEFT 0.13.0
-- Transformers 4.45.
+- Transformers 4.45.1
 - Pytorch 2.3.1+cu121
 - Datasets 2.21.0
 - Tokenizers 0.20.0
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:f63cc530a8bbed71ec5c9e53b4a02a21d00fd2376226c3f48343ec58132cbbe5
+size 982663982
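The adapter_model.bin entry above is a Git LFS pointer, not the adapter weights themselves: the repository stores only the three-line `version` / `oid` / `size` record, and the real binary is fetched by hash. As a minimal sketch, such a pointer can be parsed like this (the `parse_lfs_pointer` helper is our own illustration, not part of any library):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields.

    Each line of a pointer is "<key> <value>"; the keys shown in the
    diff above are version, oid, and size.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # size is the byte length of the actual object the pointer stands for
    fields["size"] = int(fields["size"])
    return fields


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f63cc530a8bbed71ec5c9e53b4a02a21d00fd2376226c3f48343ec58132cbbe5
size 982663982
"""

info = parse_lfs_pointer(pointer)
print(info["oid"])   # sha256 hash identifying the stored adapter weights
print(info["size"])  # payload size in bytes (here roughly 0.9 GiB)
```

This matches the diff: the commit fills in the `oid` and `size` fields of the pointer, so the 982,663,982-byte adapter binary replaces the previous one.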