venetis committed on
Commit 215effe
1 Parent(s): ce26b76

End of training

Files changed (2)
  1. README.md +8 -20
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -31,7 +31,7 @@ datasets:
     type: sharegpt
     conversation: llama3
 dataset_prepared_path:
-val_set_size: 0.05
+val_set_size: 0.15
 output_dir: ./outputs_lora-out
 hub_model_id: venetis/llama3-8b-hermes-sandals-100
 
@@ -46,7 +46,8 @@ lora_fan_in_fan_out:
 
 
 sequence_len: 4096
-sample_packing: false
+sample_packing: true
+eval_sample_packing: false
 pad_to_sequence_len: true
 wandb_project: llama-3-8b-hermes-sandals-first100
 wandb_entity: venetispall
@@ -95,7 +96,7 @@ special_tokens:
 
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6973
+- Loss: 0.9990
 
 ## Model description
 
@@ -129,23 +130,10 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.1059 | 0.0833 | 1 | 0.9750 |
-| 0.7822 | 0.25 | 3 | 0.9658 |
-| 0.9619 | 0.5 | 6 | 0.8441 |
-| 0.9988 | 0.75 | 9 | 0.7469 |
-| 1.0968 | 1.0 | 12 | 0.7146 |
-| 0.9651 | 1.25 | 15 | 0.6966 |
-| 1.0208 | 1.5 | 18 | 0.6926 |
-| 0.6837 | 1.75 | 21 | 0.7035 |
-| 0.8704 | 2.0 | 24 | 0.6958 |
-| 0.4346 | 2.25 | 27 | 0.6800 |
-| 0.5709 | 2.5 | 30 | 0.6885 |
-| 0.331 | 2.75 | 33 | 0.6952 |
-| 0.5077 | 3.0 | 36 | 0.6953 |
-| 0.6047 | 3.25 | 39 | 0.6939 |
-| 0.5974 | 3.5 | 42 | 0.6961 |
-| 0.5238 | 3.75 | 45 | 0.6963 |
-| 0.7728 | 4.0 | 48 | 0.6973 |
+| 0.6949 | 1.0 | 1 | 1.0314 |
+| 0.3183 | 1.3333 | 2 | 1.0297 |
+| 0.7635 | 2.0 | 3 | 1.0211 |
+| 1.0254 | 2.6667 | 4 | 0.9990 |
 
 
 ### Framework versions
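The README above names meta-llama/Meta-Llama-3-8B as the base model and publishes the adapter under the repo given in hub_model_id. A minimal sketch of loading the retrained adapter might look like the following (assuming the usual transformers + peft loading path; this code is not part of the commit):

```python
# Sketch: attach the LoRA adapter from this repo to the base model.
# Repo ids are taken from the README diff above; dtype is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B"
adapter_id = "venetis/llama3-8b-hermes-sandals-100"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# adapter_model.bin (updated in this commit) holds the LoRA weights applied here.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```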
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d0ff7401a6dc3aa95a53ca2d5d6e0a5a7052c4f8390449f39ce2d1a8ec5eabf6
+oid sha256:ef226f4d8644162f1b2001c61bd69c56b486320bfabfcafcf603f8e5af0e5b6c
 size 167934026
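adapter_model.bin is tracked with Git LFS, so the diff above only swaps the pointer file: the oid is the SHA-256 of the weight file's contents and the size stays at 167934026 bytes. A small sketch (hypothetical helper, not part of the repo) for checking a downloaded file against the new pointer:

```python
# Sketch: verify a downloaded adapter_model.bin against the Git LFS pointer above.
# Git LFS oids are the SHA-256 of the file contents; the expected value is the
# new oid recorded in this commit.
import hashlib
from pathlib import Path

EXPECTED_OID = "ef226f4d8644162f1b2001c61bd69c56b486320bfabfcafcf603f8e5af0e5b6c"

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 in chunks to avoid loading ~168 MB at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    actual = sha256_of("adapter_model.bin")
    print("match" if actual == EXPECTED_OID else f"mismatch: {actual}")
```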