End of training

- README.md +8 -20
- adapter_model.bin +1 -1

README.md
CHANGED
@@ -31,7 +31,7 @@ datasets:
 type: sharegpt
 conversation: llama3
 dataset_prepared_path:
-val_set_size: 0.
+val_set_size: 0.15
 output_dir: ./outputs_lora-out
 hub_model_id: venetis/llama3-8b-hermes-sandals-100
 
@@ -46,7 +46,8 @@ lora_fan_in_fan_out:
 
 
 sequence_len: 4096
-sample_packing:
+sample_packing: true
+eval_sample_packing: false
 pad_to_sequence_len: true
 wandb_project: llama-3-8b-hermes-sandals-first100
 wandb_entity: venetispall
@@ -95,7 +96,7 @@ special_tokens:
 
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.9990
 
 ## Model description
 
@@ -129,23 +130,10 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-
-| 0.
-| 0.
-
-| 1.0968 | 1.0 | 12 | 0.7146 |
-| 0.9651 | 1.25 | 15 | 0.6966 |
-| 1.0208 | 1.5 | 18 | 0.6926 |
-| 0.6837 | 1.75 | 21 | 0.7035 |
-| 0.8704 | 2.0 | 24 | 0.6958 |
-| 0.4346 | 2.25 | 27 | 0.6800 |
-| 0.5709 | 2.5 | 30 | 0.6885 |
-| 0.331 | 2.75 | 33 | 0.6952 |
-| 0.5077 | 3.0 | 36 | 0.6953 |
-| 0.6047 | 3.25 | 39 | 0.6939 |
-| 0.5974 | 3.5 | 42 | 0.6961 |
-| 0.5238 | 3.75 | 45 | 0.6963 |
-| 0.7728 | 4.0 | 48 | 0.6973 |
+| 0.6949 | 1.0 | 1 | 1.0314 |
+| 0.3183 | 1.3333 | 2 | 1.0297 |
+| 0.7635 | 2.0 | 3 | 1.0211 |
+| 1.0254 | 2.6667 | 4 | 0.9990 |
 
 
 ### Framework versions
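The `val_set_size: 0.15` change in this commit reserves a fraction of the dataset for evaluation instead of training on everything. A minimal sketch of that split arithmetic follows; the shuffle/seed details are illustrative (not axolotl's exact implementation), and the dataset size of 100 is an assumption based on the "first100" name in the config.

```python
# Sketch: how a fractional val_set_size turns into train/eval example counts.
# The dataset size (100) is assumed from the "first100" repo name.
import random

def split_dataset(examples, val_set_size, seed=42):
    """Shuffle examples and hold out a val_set_size fraction for evaluation."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n_val = round(len(examples) * val_set_size)  # nearest-integer holdout size
    return examples[n_val:], examples[:n_val]    # (train, eval)

train, evals = split_dataset(range(100), 0.15)
print(len(train), len(evals))  # 85 15
```

With only ~15 held-out conversations, a single eval loss like the 0.9990 reported above is a fairly noisy estimate, which is worth keeping in mind when comparing runs.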
adapter_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ef226f4d8644162f1b2001c61bd69c56b486320bfabfcafcf603f8e5af0e5b6c
 size 167934026
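The diff above touches a Git LFS pointer file, not the adapter weights themselves: the repository tracks only the small `version`/`oid`/`size` record, and the 167 MB binary lives in LFS storage. A minimal sketch of reading such a pointer (the `parse_lfs_pointer` helper is illustrative, not part of any library):

```python
# Sketch: parse a Git LFS pointer file into its key/value fields.
# Pointer format: https://git-lfs.github.com/spec/v1
def parse_lfs_pointer(text):
    """Each line is '<key> <value>'; return the fields as a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ef226f4d8644162f1b2001c61bd69c56b486320bfabfcafcf603f8e5af0e5b6c
size 167934026
"""

fields = parse_lfs_pointer(pointer)
print(int(fields["size"]))  # 167934026
```

The unchanged `size` alongside a changed `oid` is why the diff shows the weights were replaced by a same-sized file with different contents.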