End of training

Files changed (4)

README.md CHANGED

@@ -8,18 +8,18 @@ datasets:
 - generator
 base_model: NousResearch/Llama-2-7b-hf
 model-index:
-- name: llama2-7b-int4-dolly-15k-hindi-flash-attention-2-w-packing
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# llama2-7b-int4-dolly-15k-hindi-flash-attention-2-w-packing
 This model is a fine-tuned version of [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.2200
 ## Model description
@@ -51,10 +51,10 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.2692        | 0.64  | 100  | 1.2311          |
-| 1.1911        | 1.27  | 200  | 1.2219          |
-| 1.1786        | 1.91  | 300  | 1.2171          |
-| 1.1377        | 2.55  | 400  | 1.2200          |
 ### Framework versions

 - generator
 base_model: NousResearch/Llama-2-7b-hf
 model-index:
+- name: llama2-7b-int4-dolly-15k-english-flash-attention2-w-packing
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# llama2-7b-int4-dolly-15k-english-flash-attention2-w-packing
 This model is a fine-tuned version of [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf) on the generator dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.2201
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 1.2688        | 0.64  | 100  | 1.2310          |
+| 1.1907        | 1.27  | 200  | 1.2219          |
+| 1.178         | 1.91  | 300  | 1.2170          |
+| 1.1368        | 2.55  | 400  | 1.2201          |
 ### Framework versions
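
Note on the README.md hunks: the two runs differ by about 0.0001 in validation loss at most, and in both the lowest validation loss is at step 300, not at the final step whose value is reported as the headline `Loss`. The following is a minimal sketch for checking this from the table rows; the dictionaries are copied by hand from the `-` and `+` lines above, and the helper names `best_step` and `max_abs_delta` are hypothetical, not part of any library.

```python
# Validation-loss rows copied from the old (-) and new (+) sides of the README.md diff.
# Keys are training steps, values are validation losses.
old_run = {100: 1.2311, 200: 1.2219, 300: 1.2171, 400: 1.2200}
new_run = {100: 1.2310, 200: 1.2219, 300: 1.2170, 400: 1.2201}


def best_step(run):
    """Return the step with the lowest validation loss."""
    return min(run, key=run.get)


def max_abs_delta(a, b):
    """Return the largest per-step absolute difference between two runs."""
    return max(abs(a[step] - b[step]) for step in a)


print("best step (old run):", best_step(old_run))
print("best step (new run):", best_step(new_run))
print("max |delta| in validation loss:", round(max_abs_delta(old_run, new_run), 4))
```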

adapter_config.json CHANGED

@@ -1,7 +1,7 @@
 {
   "alpha_pattern": {},
   "auto_mapping": null,
-  "base_model_name_or_path": null,
   "bias": "none",
   "fan_in_fan_out": false,
   "inference_mode": true,
@@ -19,9 +19,9 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "k_proj",
     "q_proj",
-    "o_proj",
     "v_proj"
   ],
   "task_type": "CAUSAL_LM",

 {
   "alpha_pattern": {},
   "auto_mapping": null,
+  "base_model_name_or_path": "NousResearch/Llama-2-7b-hf",
   "bias": "none",
   "fan_in_fan_out": false,
   "inference_mode": true,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "o_proj",
     "k_proj",
     "q_proj",
     "v_proj"
   ],
   "task_type": "CAUSAL_LM",

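Note on the adapter_config.json hunks: `o_proj` only moves within `target_modules`, so the list holds the same four modules on both sides, and the one value that changes is `base_model_name_or_path`. A minimal sketch of that comparison follows. It assumes the order of `target_modules` does not matter, uses fragments reduced to the fields visible in the diff (the full file has more keys), and the helper `semantic_changes` is hypothetical.

```python
import json

# Fragments reduced to the fields visible in the diff; the real file has more keys.
old_cfg = json.loads("""{
  "base_model_name_or_path": null,
  "target_modules": ["k_proj", "q_proj", "o_proj", "v_proj"]
}""")
new_cfg = json.loads("""{
  "base_model_name_or_path": "NousResearch/Llama-2-7b-hf",
  "target_modules": ["o_proj", "k_proj", "q_proj", "v_proj"]
}""")


def semantic_changes(old, new):
    """Return {key: (old_value, new_value)} for keys whose values differ.

    Lists are compared after sorting, on the assumption that their
    order carries no meaning.
    """
    changed = {}
    for key in sorted(set(old) | set(new)):
        a, b = old.get(key), new.get(key)
        if isinstance(a, list) and isinstance(b, list):
            a, b = sorted(a), sorted(b)
        if a != b:
            changed[key] = (old.get(key), new.get(key))
    return changed


print(semantic_changes(old_cfg, new_cfg))
```
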
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9c6fc3395801474a11d75ca2bc6cc4d780bcc91f5dfc2d1ad03afa00ceb22b32
-size 268474624

 version https://git-lfs.github.com/spec/v1
+oid sha256:b5a2b6aecd25e7361cf2b7d68ea41ba7a91054e9ca653c3883b82a1e774f282f
+size 268470272
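
Note on the adapter_model.safetensors pointers: both the `oid` and the `size` change, so the stored weights file itself was replaced. A minimal sketch for parsing the two Git LFS pointer bodies and computing the size difference is shown here; the pointer text is copied from the hunks above, and `parse_lfs_pointer` is a hypothetical helper, not a Git LFS API.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


old_ptr = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:9c6fc3395801474a11d75ca2bc6cc4d780bcc91f5dfc2d1ad03afa00ceb22b32
size 268474624
""")
new_ptr = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:b5a2b6aecd25e7361cf2b7d68ea41ba7a91054e9ca653c3883b82a1e774f282f
size 268470272
""")

# A different oid means the file content changed; the delta is new minus old.
content_changed = old_ptr["oid"] != new_ptr["oid"]
size_delta = new_ptr["size"] - old_ptr["size"]
print(content_changed, size_delta)
```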

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ebb02a0f9e6aa49806b96231b3de7fe3d95fe75babf8b4c80b7e9a035bf55571
 size 4792

 version https://git-lfs.github.com/spec/v1
+oid sha256:8c5ad94a420aaaa7f468152d39e2474600276a6d27b953ec633d7930946cb2d7
 size 4792