SuperSecureHuman committed
Commit 9fb2633
1 Parent(s): afffb72

Model save

Files changed (1)
  1. README.md +27 -19
README.md CHANGED
@@ -1,9 +1,9 @@
 ---
-license: mit
+license: llama2
 library_name: peft
 tags:
 - generated_from_trainer
-base_model: microsoft/phi-2
+base_model: codellama/CodeLlama-7b-Instruct-hf
 model-index:
 - name: vendata-train
   results: []
@@ -14,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # vendata-train
 
-This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on an unknown dataset.
+This model is a fine-tuned version of [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1862
+- Loss: 0.9052
 
 ## Model description
 
@@ -35,32 +35,40 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.0001
-- train_batch_size: 8
-- eval_batch_size: 8
+- learning_rate: 2e-05
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 32
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 64
+- total_eval_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
-- training_steps: 500
+- training_steps: 100
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.442 | 0.2 | 100 | 1.3003 |
-| 1.3957 | 0.4 | 200 | 1.2327 |
-| 1.2733 | 0.6 | 300 | 1.2029 |
-| 1.4089 | 0.8 | 400 | 1.1889 |
-| 1.2397 | 1.0 | 500 | 1.1862 |
+| 1.0633 | 0.1 | 10 | 1.0362 |
+| 1.2685 | 0.2 | 20 | 0.9920 |
+| 1.2542 | 0.3 | 30 | 0.9562 |
+| 1.1031 | 0.4 | 40 | 0.9356 |
+| 1.0196 | 0.5 | 50 | 0.9224 |
+| 0.9397 | 0.6 | 60 | 0.9140 |
+| 0.9485 | 0.7 | 70 | 0.9091 |
+| 0.9506 | 0.8 | 80 | 0.9064 |
+| 0.978 | 0.9 | 90 | 0.9054 |
+| 1.0167 | 1.0 | 100 | 0.9052 |
 
 
 ### Framework versions
 
 - PEFT 0.8.2
-- Transformers 4.37.2
-- Pytorch 2.1.0+cu121
-- Datasets 2.17.1
-- Tokenizers 0.15.2
+- Transformers 4.36.2
+- Pytorch 2.1.2
+- Datasets 2.16.1
+- Tokenizers 0.15.0
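
For context, the updated hyperparameter block is internally consistent: 4 per-device batch size × 2 GPUs × 8 gradient-accumulation steps = 64, matching total_train_batch_size. A minimal sketch of the corresponding Transformers `TrainingArguments`, assuming the card's values map onto the standard argument names; the output path and the eval/logging cadence (every 10 steps, matching the results table) are illustrative guesses, not taken from this commit:

```python
# Sketch only: TrainingArguments mirroring the values in the updated card.
# output_dir and the eval/logging cadence are assumptions, not from the commit.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="vendata-train",        # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=4,     # 4 x 2 GPUs x 8 accum steps = 64 effective
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,
    max_steps=100,                     # training_steps: 100
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    evaluation_strategy="steps",       # assumed from the table's 10-step cadence
    eval_steps=10,
    logging_steps=10,
)
```

The Adam betas (0.9, 0.999) and epsilon 1e-08 listed in the card are the `TrainingArguments` defaults, so they need no explicit arguments.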
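And a minimal inference sketch for the resulting PEFT adapter on top of the new base model, assuming the adapter is published as SuperSecureHuman/vendata-train (inferred from the card name, not confirmed by this commit) and using the usual CodeLlama-Instruct `[INST]` prompt format:

```python
# Sketch only: loading the trained LoRA/PEFT adapter on top of the base model.
# The adapter repo id below is inferred from the card name and may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")
model = PeftModel.from_pretrained(base, "SuperSecureHuman/vendata-train")  # assumed id

prompt = "[INST] Write a Python function that reverses a string. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```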