NicholasCorrado committed on
Commit 69c9d49
1 Parent(s): 5ee7120

Model save

Files changed (4):
  1. README.md +94 -0
  2. all_results.json +9 -0
  3. generation_config.json +6 -0
  4. train_results.json +9 -0
README.md ADDED
@@ -0,0 +1,94 @@
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: alignment-handbook/zephyr-7b-sft-full
+ tags:
+ - trl
+ - dpo
+ - generated_from_trainer
+ model-index:
+ - name: zephyr-7b-uf-rlced-conifer-group-dpo-2e
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # zephyr-7b-uf-rlced-conifer-group-dpo-2e
+
+ This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.2410
+ - Rewards/chosen: -3.4515
+ - Rewards/rejected: -8.7505
+ - Rewards/accuracies: 0.8769
+ - Rewards/margins: 5.2990
+ - Logps/rejected: -1278.7848
+ - Logps/chosen: -737.6204
+ - Logits/rejected: 3.0507
+ - Logits/chosen: 0.9407
+ - Alpha0: 0.6369
+ - Alpha1: 0.3631
+ - Task Loss1: 0.1726
+ - Task Excess Loss1: 0.0379
+ - Excess Loss: 0.0341
+ - Task Loss0: 0.5306
+ - Task Excess Loss0: 0.0889
+
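The evaluation metrics above are internally consistent; a minimal sanity-check sketch (values copied from the list, assuming the margin is chosen minus rejected reward and the two alpha weights form a convex combination):

```python
# Sanity-check the reported evaluation metrics (values copied from the card above).
rewards_chosen = -3.4515
rewards_rejected = -8.7505
rewards_margins = 5.2990
alpha0, alpha1 = 0.6369, 0.3631

# Margin is the gap between chosen and rejected rewards.
assert abs((rewards_chosen - rewards_rejected) - rewards_margins) < 1e-4
# The two group weights sum to one.
assert abs((alpha0 + alpha1) - 1.0) < 1e-4
```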
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-07
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 256
+ - total_eval_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 2
+
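The effective (total) train batch size in the list above follows from the per-device batch size, the device count, and gradient accumulation; a quick check:

```python
# Effective batch size = per-device batch * num devices * grad accumulation steps.
train_batch_size = 8            # per device
num_devices = 8
gradient_accumulation_steps = 4

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
assert total_train_batch_size == 256  # matches the reported total_train_batch_size
```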
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Alpha0 | Alpha1 | Task Loss1 | Task Excess Loss1 | Excess Loss | Task Loss0 | Task Excess Loss0 |
+ |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:------:|:------:|:----------:|:-----------------:|:-----------:|:----------:|:-----------------:|
+ | 0.3541 | 0.1388 | 100 | 0.4194 | -1.3743 | -2.6267 | 0.8102 | 1.2524 | -666.4093 | -529.9026 | -2.7580 | -2.7843 | 0.8214 | 0.1786 | 0.3373 | 0.1973 | 0.1899 | 0.6883 | 0.2655 |
+ | 0.2214 | 0.2776 | 200 | 0.3480 | -1.2450 | -2.9488 | 0.8412 | 1.7038 | -698.6146 | -516.9692 | 0.1216 | -0.2174 | 0.8786 | 0.1214 | 0.2866 | 0.1517 | 0.1250 | 0.5355 | 0.0929 |
+ | 0.2284 | 0.4164 | 300 | 0.3271 | -1.7298 | -3.6279 | 0.8515 | 1.8981 | -766.5247 | -565.4502 | 1.3769 | 0.5823 | 0.6417 | 0.3583 | 0.2721 | 0.1383 | 0.1130 | 0.5406 | 0.0794 |
+ | 0.1837 | 0.5552 | 400 | 0.3040 | -1.7232 | -4.0037 | 0.8553 | 2.2805 | -804.1021 | -564.7872 | 1.8300 | 0.7862 | 0.7891 | 0.2109 | 0.2517 | 0.1159 | 0.0949 | 0.5490 | 0.0796 |
+ | 0.1749 | 0.6940 | 500 | 0.2966 | -1.7976 | -4.1927 | 0.8637 | 2.3951 | -823.0039 | -572.2305 | 1.7164 | 0.5785 | 0.8057 | 0.1943 | 0.2448 | 0.1097 | 0.0856 | 0.5124 | 0.0570 |
+ | 0.1823 | 0.8328 | 600 | 0.3030 | -1.7187 | -3.9261 | 0.8647 | 2.2074 | -796.3432 | -564.3366 | 2.4921 | 1.3988 | 0.9053 | 0.0947 | 0.2541 | 0.1193 | 0.0922 | 0.5047 | 0.0596 |
+ | 0.1766 | 0.9715 | 700 | 0.2895 | -1.6400 | -4.2369 | 0.8647 | 2.5969 | -827.4293 | -556.4711 | 1.6749 | 0.1680 | 0.9622 | 0.0378 | 0.2417 | 0.1057 | 0.0812 | 0.5020 | 0.0532 |
+ | 0.1131 | 1.1103 | 800 | 0.2646 | -2.7794 | -6.7040 | 0.8647 | 3.9245 | -1074.1326 | -670.4117 | 2.3249 | 0.3844 | 0.0325 | 0.9675 | 0.1990 | 0.0653 | 0.0567 | 0.5372 | 0.0871 |
+ | 0.1006 | 1.2491 | 900 | 0.2490 | -3.6465 | -8.6692 | 0.8712 | 5.0227 | -1270.6554 | -757.1147 | 3.3211 | 1.0777 | 0.4760 | 0.5240 | 0.1852 | 0.0492 | 0.0420 | 0.5341 | 0.0967 |
+ | 0.0951 | 1.3879 | 1000 | 0.2470 | -3.0354 | -7.7369 | 0.8797 | 4.7015 | -1177.4214 | -696.0082 | 3.1614 | 0.9199 | 0.0150 | 0.9850 | 0.1756 | 0.0450 | 0.0382 | 0.5249 | 0.0834 |
+ | 0.0885 | 1.5267 | 1100 | 0.2435 | -3.4543 | -8.4740 | 0.8731 | 5.0197 | -1251.1321 | -737.8961 | 3.4589 | 1.3892 | 0.0151 | 0.9849 | 0.1747 | 0.0421 | 0.0368 | 0.5310 | 0.0887 |
+ | 0.1003 | 1.6655 | 1200 | 0.2416 | -3.3615 | -8.4285 | 0.875 | 5.0670 | -1246.5889 | -728.6184 | 2.9341 | 0.9100 | 0.0721 | 0.9279 | 0.1730 | 0.0396 | 0.0352 | 0.5285 | 0.0863 |
+ | 0.0865 | 1.8043 | 1300 | 0.2412 | -3.3114 | -8.4737 | 0.8769 | 5.1623 | -1251.1091 | -723.6140 | 2.9432 | 0.8628 | 0.0755 | 0.9245 | 0.1734 | 0.0388 | 0.0343 | 0.5272 | 0.0847 |
+ | 0.0893 | 1.9431 | 1400 | 0.2410 | -3.4515 | -8.7505 | 0.8769 | 5.2990 | -1278.7848 | -737.6204 | 3.0507 | 0.9407 | 0.6369 | 0.3631 | 0.1726 | 0.0379 | 0.0341 | 0.5306 | 0.0889 |
+
+
+ ### Framework versions
+
+ - Transformers 4.44.1
+ - Pytorch 2.1.2+cu121
+ - Datasets 2.21.0
+ - Tokenizers 0.19.1
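To approximate the training environment, the versions listed above can be pinned at install time (a sketch; the PyPI package names are assumed to match the libraries named in the card, and the CUDA build of PyTorch may need the appropriate index URL):

```shell
# Pin the framework versions reported in the model card (assumed PyPI names).
pip install "transformers==4.44.1" "torch==2.1.2" "datasets==2.21.0" "tokenizers==0.19.1"
```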
all_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "epoch": 1.9986120749479528,
+     "total_flos": 0.0,
+     "train_loss": 0.17575526105033026,
+     "train_runtime": 46867.94,
+     "train_samples": 184443,
+     "train_samples_per_second": 7.871,
+     "train_steps_per_second": 0.031
+ }
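The throughput figure is consistent with the sample count and runtime: with num_epochs = 2 (from the model card), samples-per-second works out to train_samples * 2 / train_runtime (a sketch assuming that is how the Trainer derives it):

```python
# Reproduce "train_samples_per_second" from the other fields in all_results.json.
train_samples = 184443
train_runtime = 46867.94      # seconds
num_train_epochs = 2          # from the model card hyperparameters

samples_per_second = train_samples * num_train_epochs / train_runtime
assert round(samples_per_second, 3) == 7.871  # matches the reported value
```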
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+     "_from_model_config": true,
+     "bos_token_id": 1,
+     "eos_token_id": 2,
+     "transformers_version": "4.44.1"
+ }
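The generation config above is plain JSON and can be inspected with the standard library alone; a minimal sketch confirming the special-token ids:

```python
import json

# Parse the generation_config.json content shown above.
generation_config = json.loads("""
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.44.1"
}
""")

assert generation_config["bos_token_id"] == 1  # <s> for Mistral-based models
assert generation_config["eos_token_id"] == 2  # </s>
```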
train_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "epoch": 1.9986120749479528,
+     "total_flos": 0.0,
+     "train_loss": 0.17575526105033026,
+     "train_runtime": 46867.94,
+     "train_samples": 184443,
+     "train_samples_per_second": 7.871,
+     "train_steps_per_second": 0.031
+ }