vdavidr/deepseek-coder-6.7b-instruct_En__size_52_epochs_10_2024-06-21_06-20-33_3556409

vdavidr committed on Jun 21, 2024

Commit 86abc8a (verified) · 1 Parent(s): a85c025

Add readme

Files changed (1)

README.md +85 -0

README.md ADDED

	@@ -0,0 +1,85 @@

+---
+license: other
+base_model: deepseek-ai/deepseek-coder-6.7b-instruct
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+- bleu
+- sacrebleu
+- rouge
+model-index:
+- name: deepseek-coder-6.7b-instruct_En__size_52_epochs_10_2024-06-21_06-20-33_3556409
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# deepseek-coder-6.7b-instruct_En__size_52_epochs_10_2024-06-21_06-20-33_3556409
+This model is a fine-tuned version of [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.4340
+- Accuracy: 0.042
+- Chrf: 0.734
+- Bleu: 0.608
+- Sacrebleu: 0.6
+- Rouge1: 0.707
+- Rouge2: 0.494
+- Rougel: 0.637
+- Rougelsum: 0.693
+- Meteor: 0.534
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.001
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 3407
+- distributed_type: multi-GPU
+- num_devices: 4
+- total_train_batch_size: 4
+- total_eval_batch_size: 4
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 52
+- training_steps: 520
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | Chrf  | Bleu  | Sacrebleu | Rouge1 | Rouge2 | Rougel | Rougelsum | Meteor |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:-----:|:-----:|:---------:|:------:|:------:|:------:|:---------:|:------:|
+| 0.1233        | 4.0   | 52   | 1.1674          | 0.027    | 0.726 | 0.601 | 0.6       | 0.681  | 0.458  | 0.612  | 0.674     | 0.539  |
+| 0.5834        | 8.0   | 104  | 1.2639          | 0.032    | 0.708 | 0.57  | 0.6       | 0.686  | 0.458  | 0.617  | 0.679     | 0.483  |
+| 0.1938        | 12.0  | 156  | 1.2723          | 0.032    | 0.708 | 0.574 | 0.6       | 0.684  | 0.457  | 0.609  | 0.673     | 0.479  |
+| 0.1681        | 16.0  | 208  | 1.2437          | 0.036    | 0.719 | 0.595 | 0.6       | 0.697  | 0.469  | 0.619  | 0.682     | 0.524  |
+| 0.176         | 20.0  | 260  | 1.4102          | 0.037    | 0.699 | 0.565 | 0.6       | 0.666  | 0.435  | 0.588  | 0.652     | 0.507  |
+| 0.4563        | 24.0  | 312  | 1.3416          | 0.039    | 0.717 | 0.586 | 0.6       | 0.69   | 0.452  | 0.609  | 0.678     | 0.521  |
+| 0.114         | 28.0  | 364  | 1.3758          | 0.041    | 0.728 | 0.602 | 0.6       | 0.703  | 0.478  | 0.618  | 0.683     | 0.524  |
+| 0.4204        | 32.0  | 416  | 1.4116          | 0.042    | 0.727 | 0.598 | 0.6       | 0.705  | 0.476  | 0.621  | 0.689     | 0.545  |
+| 0.1118        | 36.0  | 468  | 1.4229          | 0.042    | 0.734 | 0.607 | 0.6       | 0.709  | 0.497  | 0.64   | 0.694     | 0.528  |
+| 0.2482        | 40.0  | 520  | 1.4340          | 0.042    | 0.734 | 0.608 | 0.6       | 0.707  | 0.494  | 0.637  | 0.693     | 0.534  |
+### Framework versions
+- Transformers 4.37.0
+- PyTorch 2.2.1+cu121
+- Datasets 2.20.0
+- Tokenizers 0.15.2
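
The hyperparameters in the README imply a few derived quantities that the card does not spell out. The sketch below works them out in plain Python. It assumes no gradient accumulation (none is listed) and that `lr_scheduler_type: linear` means linear warmup from zero followed by linear decay to zero, which is how the Transformers linear schedule behaves. The function name `linear_warmup_lr` is hypothetical and is not part of any library.

```python
# Values copied from the README's hyperparameter list.
PER_DEVICE_BATCH = 1
NUM_DEVICES = 4
BASE_LR = 1e-3
WARMUP_STEPS = 52
TRAINING_STEPS = 520

# Effective batch size, assuming no gradient accumulation.
total_batch = PER_DEVICE_BATCH * NUM_DEVICES  # 4, matches total_train_batch_size

# The results table reports step 52 at epoch 4.0, so one epoch is 13 steps.
steps_per_epoch = 52 / 4.0
samples_per_epoch = steps_per_epoch * total_batch  # 52.0


def linear_warmup_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
                     total_steps=TRAINING_STEPS):
    """Learning rate after `step` updates: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("total batch size:", total_batch)
    print("steps per epoch:", steps_per_epoch)
    for step in (0, 26, 52, 286, 520):
        print(step, linear_warmup_lr(step))
```

Under these assumptions the learning rate peaks at 0.001 at step 52, the first row of the results table, and reaches zero at step 520. The 52 samples per epoch would be consistent with the `size_52` in the model name, though the card itself does not state the dataset size.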