vdaita/diff-deepseek-ellipsis

Generated from Trainer

8-bit precision


vdaita committed on Jul 6, 2024

Commit 4c7df4f · verified · 1 Parent(s): 4549a9a

End of training

Files changed (2)

README.md +5 -6
adapter_model.bin +1 -1

README.md CHANGED

@@ -107,7 +107,7 @@ special_tokens:
 This model is a fine-tuned version of [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0202
+- Loss: 0.1634
 ## Model description
@@ -144,11 +144,10 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.2054        | 0.02  | 1    | 0.2354          |
-| 0.062         | 0.25  | 15   | 0.0651          |
-| 0.0333        | 0.5   | 30   | 0.0370          |
-| 0.0215        | 0.75  | 45   | 0.0218          |
-| 0.0174        | 1.0   | 60   | 0.0202          |
+| 0.3241        | 0.02  | 1    | 0.3550          |
+| 0.2785        | 0.25  | 11   | 0.2303          |
+| 0.2129        | 0.51  | 22   | 0.1771          |
+| 0.1803        | 0.76  | 33   | 0.1634          |
 ### Framework versions
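
In both versions of the README, the reported evaluation loss equals the last Validation Loss entry in the table (0.0202 at step 60 before, 0.1634 at step 33 after). A minimal Python sketch of that check for the added rows follows. The names `NEW_ROWS` and `parse_rows` are hypothetical and not part of the repository; the row values are copied from the diff above.

```python
# Rows copied from the added (+) side of the README.md diff.
# NEW_ROWS and parse_rows are illustrative names, not part of the repo.
NEW_ROWS = """\
| 0.3241        | 0.02  | 1    | 0.3550          |
| 0.2785        | 0.25  | 11   | 0.2303          |
| 0.2129        | 0.51  | 22   | 0.1771          |
| 0.1803        | 0.76  | 33   | 0.1634          |
"""


def parse_rows(table: str) -> list[dict]:
    """Parse markdown table rows into dicts of numeric values."""
    rows = []
    for line in table.strip().splitlines():
        # Drop the outer pipes, then split the four cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
            }
        )
    return rows


rows = parse_rows(NEW_ROWS)
final = rows[-1]
reported_loss = 0.1634  # "- Loss:" line on the added side of the diff

print(f"last step {final['step']} at epoch {final['epoch']}: "
      f"val_loss={final['val_loss']}, matches card: "
      f"{final['val_loss'] == reported_loss}")
```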

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:93937c3dd5f3a3538a2062a256e4416162f68ea8decad2f081fd608f1ae1eb64
+oid sha256:3819cb5d5e46941f03a7f51ca30705ee8c8cb14dc6cf5ef1b056b94e2798cde2
 size 848460690
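
The `adapter_model.bin` diff changes only a Git LFS pointer: the `oid` differs while the `size` stays at 848460690 bytes. As a rough sketch, such a pointer can be parsed and compared against a downloaded file using only the Python standard library. The helpers `parse_pointer` and `matches_pointer` are hypothetical names introduced here, and the local path passed to `matches_pointer` is an assumption about where the file was saved.

```python
import hashlib

# Pointer text as it appears on the added side of the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3819cb5d5e46941f03a7f51ca30705ee8c8cb14dc6cf5ef1b056b94e2798cde2
size 848460690
"""


def parse_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, oid, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large adapter files are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and digest.hexdigest() == pointer["oid"]


pointer = parse_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("adapter_model.bin", pointer)` on a local copy would then indicate whether that copy corresponds to this commit rather than its parent.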