End of training

Files changed (13)

README.md +43 -43
config.json +1 -1
final_checkpoint/config.json +1 -1
final_checkpoint/generation_config.json +1 -1
final_checkpoint/model-00001-of-00003.safetensors +1 -1
final_checkpoint/model-00002-of-00003.safetensors +1 -1
final_checkpoint/model-00003-of-00003.safetensors +1 -1
generation_config.json +1 -1
model-00001-of-00003.safetensors +1 -1
model-00002-of-00003.safetensors +1 -1
model-00003-of-00003.safetensors +1 -1
tokenizer.json +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.7893
+- Loss: 1.1047
 ## Model description
@@ -51,51 +51,51 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 2.3115        | 0.3333  | 25   | 2.3023          |
-| 2.3052        | 0.6667  | 50   | 2.2848          |
-| 2.2536        | 1.0     | 75   | 2.2429          |
-| 2.1452        | 1.3333  | 100  | 2.1703          |
-| 2.0776        | 1.6667  | 125  | 2.1017          |
-| 1.9661        | 2.0     | 150  | 2.0450          |
-| 1.9467        | 2.3333  | 175  | 2.0002          |
-| 1.9721        | 2.6667  | 200  | 1.9619          |
-| 1.958         | 3.0     | 225  | 1.9330          |
-| 1.8232        | 3.3333  | 250  | 1.9097          |
-| 1.79          | 3.6667  | 275  | 1.8897          |
-| 1.8144        | 4.0     | 300  | 1.8728          |
-| 1.8022        | 4.3333  | 325  | 1.8594          |
-| 1.7519        | 4.6667  | 350  | 1.8478          |
-| 1.7756        | 5.0     | 375  | 1.8375          |
-| 1.7236        | 5.3333  | 400  | 1.8291          |
-| 1.7251        | 5.6667  | 425  | 1.8218          |
-| 1.6927        | 6.0     | 450  | 1.8160          |
-| 1.6903        | 6.3333  | 475  | 1.8110          |
-| 1.6437        | 6.6667  | 500  | 1.8065          |
-| 1.7401        | 7.0     | 525  | 1.8024          |
-| 1.7231        | 7.3333  | 550  | 1.7999          |
-| 1.6817        | 7.6667  | 575  | 1.7973          |
-| 1.6488        | 8.0     | 600  | 1.7953          |
-| 1.6631        | 8.3333  | 625  | 1.7938          |
-| 1.5865        | 8.6667  | 650  | 1.7922          |
-| 1.7452        | 9.0     | 675  | 1.7913          |
-| 1.6411        | 9.3333  | 700  | 1.7904          |
-| 1.678         | 9.6667  | 725  | 1.7905          |
-| 1.7495        | 10.0    | 750  | 1.7900          |
-| 1.6078        | 10.3333 | 775  | 1.7896          |
-| 1.5673        | 10.6667 | 800  | 1.7892          |
-| 1.7028        | 11.0    | 825  | 1.7893          |
-| 1.6971        | 11.3333 | 850  | 1.7894          |
-| 1.6289        | 11.6667 | 875  | 1.7895          |
-| 1.66          | 12.0    | 900  | 1.7893          |
-| 1.6261        | 12.3333 | 925  | 1.7893          |
-| 1.6062        | 12.6667 | 950  | 1.7893          |
-| 1.6743        | 13.0    | 975  | 1.7893          |
-| 1.6396        | 13.3333 | 1000 | 1.7893          |
+| 2.4021        | 0.3333  | 25   | 2.3941          |
+| 2.3235        | 0.6667  | 50   | 2.2471          |
+| 2.0863        | 1.0     | 75   | 1.9386          |
+| 1.6662        | 1.3333  | 100  | 1.5791          |
+| 1.2956        | 1.6667  | 125  | 1.2544          |
+| 1.214         | 2.0     | 150  | 1.2116          |
+| 1.202         | 2.3333  | 175  | 1.1861          |
+| 1.1813        | 2.6667  | 200  | 1.1668          |
+| 1.1696        | 3.0     | 225  | 1.1528          |
+| 1.1052        | 3.3333  | 250  | 1.1412          |
+| 1.0614        | 3.6667  | 275  | 1.1329          |
+| 1.1106        | 4.0     | 300  | 1.1271          |
+| 1.1019        | 4.3333  | 325  | 1.1228          |
+| 1.0691        | 4.6667  | 350  | 1.1212          |
+| 1.0947        | 5.0     | 375  | 1.1153          |
+| 1.0689        | 5.3333  | 400  | 1.1134          |
+| 1.0598        | 5.6667  | 425  | 1.1116          |
+| 1.0459        | 6.0     | 450  | 1.1111          |
+| 1.0518        | 6.3333  | 475  | 1.1097          |
+| 1.045         | 6.6667  | 500  | 1.1092          |
+| 1.0658        | 7.0     | 525  | 1.1066          |
+| 1.0706        | 7.3333  | 550  | 1.1067          |
+| 1.0514        | 7.6667  | 575  | 1.1057          |
+| 1.0412        | 8.0     | 600  | 1.1063          |
+| 1.0455        | 8.3333  | 625  | 1.1052          |
+| 0.9657        | 8.6667  | 650  | 1.1057          |
+| 1.1015        | 9.0     | 675  | 1.1052          |
+| 1.0294        | 9.3333  | 700  | 1.1051          |
+| 1.0399        | 9.6667  | 725  | 1.1052          |
+| 1.1125        | 10.0    | 750  | 1.1047          |
+| 1.0219        | 10.3333 | 775  | 1.1046          |
+| 0.9862        | 10.6667 | 800  | 1.1048          |
+| 1.0682        | 11.0    | 825  | 1.1049          |
+| 1.0587        | 11.3333 | 850  | 1.1049          |
+| 1.0217        | 11.6667 | 875  | 1.1051          |
+| 1.0547        | 12.0    | 900  | 1.1047          |
+| 1.0047        | 12.3333 | 925  | 1.1047          |
+| 1.021         | 12.6667 | 950  | 1.1047          |
+| 1.0528        | 13.0    | 975  | 1.1047          |
+| 1.0385        | 13.3333 | 1000 | 1.1047          |
 ### Framework versions
-- Transformers 4.41.1
+- Transformers 4.41.2
 - Pytorch 2.0.0+cu117
-- Datasets 2.19.1
+- Datasets 2.19.2
 - Tokenizers 0.19.1
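
To put the change in final evaluation loss in more familiar terms, the sketch below converts both reported losses to perplexity and derives the number of optimizer steps per epoch from the logged table. It assumes the loss is a mean token-level cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in this diff. The variable names are illustrative only.

```python
import math

# Final evaluation losses reported in the README before and after this commit.
old_loss = 1.7893
new_loss = 1.1047

# Assumption: the loss is mean cross-entropy in nats, so perplexity = exp(loss).
old_ppl = math.exp(old_loss)
new_ppl = math.exp(new_loss)

# The table logs step 25 at epoch 0.3333, i.e. one third of an epoch.
steps_per_epoch = round(25 / (1 / 3))

print(f"perplexity: {old_ppl:.2f} -> {new_ppl:.2f}")
print(f"steps per epoch: {steps_per_epoch}")
```

Both runs log 1000 steps at the same evaluation interval, so the two tables can be compared row by row.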

config.json CHANGED

@@ -20,7 +20,7 @@
   "sliding_window": null,
   "tie_word_embeddings": false,
   "torch_dtype": "float16",
-  "transformers_version": "4.41.1",
+  "transformers_version": "4.41.2",
   "use_cache": false,
   "vocab_size": 32000
 }

final_checkpoint/config.json CHANGED

@@ -20,7 +20,7 @@
   "sliding_window": null,
   "tie_word_embeddings": false,
   "torch_dtype": "float16",
-  "transformers_version": "4.41.1",
+  "transformers_version": "4.41.2",
   "use_cache": false,
   "vocab_size": 32000
 }

final_checkpoint/generation_config.json CHANGED

@@ -2,5 +2,5 @@
   "_from_model_config": true,
   "bos_token_id": 1,
   "eos_token_id": 2,
-  "transformers_version": "4.41.1"
+  "transformers_version": "4.41.2"
 }

final_checkpoint/model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6e28dea1259a4fe27dfc1c5ccad31ee48ce8106a7c63f5adcb2d74008400cf35
+oid sha256:9aa2e9687a5e5d24a999a996e9fe4c2bc1cf34ad347da5dc5c7e0adffcb14982
 size 4943162240

final_checkpoint/model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:379a9c38347b54ee796533debd8fe7c509461e679ee37c47a77ab046ab3843e3
+oid sha256:268bb18cc8bbff53c912fa3961a6281dd5c163edd1b8e5c85c9b12e87e4e3a63
 size 4999819232

final_checkpoint/model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d9e3264da70cef319725ed8eab62a3bd01e6e60ee8ec2a9d260765987e706c68
+oid sha256:bbc021dcf68d9e7ddaab0ead255721e73b7f652e3bfd34985bba6c029e0b729c
 size 4540516256

generation_config.json CHANGED

@@ -2,5 +2,5 @@
   "_from_model_config": true,
   "bos_token_id": 1,
   "eos_token_id": 2,
-  "transformers_version": "4.41.1"
+  "transformers_version": "4.41.2"
 }

model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6e28dea1259a4fe27dfc1c5ccad31ee48ce8106a7c63f5adcb2d74008400cf35
+oid sha256:9aa2e9687a5e5d24a999a996e9fe4c2bc1cf34ad347da5dc5c7e0adffcb14982
 size 4943162240

model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:379a9c38347b54ee796533debd8fe7c509461e679ee37c47a77ab046ab3843e3
+oid sha256:268bb18cc8bbff53c912fa3961a6281dd5c163edd1b8e5c85c9b12e87e4e3a63
 size 4999819232

model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d9e3264da70cef319725ed8eab62a3bd01e6e60ee8ec2a9d260765987e706c68
+oid sha256:bbc021dcf68d9e7ddaab0ead255721e73b7f652e3bfd34985bba6c029e0b729c
 size 4540516256

tokenizer.json CHANGED

@@ -2,7 +2,7 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 100,
+    "max_length": 1024,
     "strategy": "LongestFirst",
     "stride": 0
   },
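
The `truncation` block caps how many tokens an encoded sequence may keep, so this change raises the cap from 100 to 1024 tokens. The sketch below is a simplified pure-Python illustration of right-side truncation on a single sequence, not the `tokenizers` library's implementation. `truncate_ids` is a hypothetical helper, and it ignores special tokens, sequence pairs, and `stride` overflow handling.

```python
import json

# The truncation block from tokenizer.json after this commit.
truncation = json.loads("""
{
  "direction": "Right",
  "max_length": 1024,
  "strategy": "LongestFirst",
  "stride": 0
}
""")

def truncate_ids(ids, cfg):
    """Hypothetical helper: trim one sequence of token ids to cfg["max_length"]."""
    limit = cfg["max_length"]
    if len(ids) <= limit:
        return list(ids)
    # "Right" drops tokens from the end; "Left" would drop them from the start.
    if cfg["direction"] == "Right":
        return list(ids[:limit])
    return list(ids[-limit:])

ids = list(range(1500))  # stand-in for a 1500-token input
kept = truncate_ids(ids, truncation)
kept_old = truncate_ids(ids, {**truncation, "max_length": 100})

print(len(kept), len(kept_old))
```

With the previous setting, the same 1500-token input would have been cut to its first 100 tokens.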

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:115ab8ced82e34c8578a7fd2b09fb09ed633126905bc52f9cfc3832842de2535
+oid sha256:9c1e679e02136ed9a355b147a03ca6fee604b4360a056e070744e6783bbae220
 size 4603
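
The `.safetensors` and `.bin` entries above are Git LFS pointer files: each records the SHA-256 of the real file's contents as `oid` and its byte length as `size`. A minimal sketch of checking a downloaded file against its pointer follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names rather than part of any Git LFS tooling, and the sketch handles only the three keys shown in this diff.

```python
import hashlib
import os

def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so multi-gigabyte shards need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["oid"]

# Pointer for training_args.bin after this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:9c1e679e02136ed9a355b147a03ca6fee604b4360a056e070744e6783bbae220\n"
    "size 4603\n"
)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("training_args.bin", pointer)` on a local copy would then confirm that the download corresponds to this commit. The size is compared first because it is far cheaper than hashing.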