zephyr-7b-gpo-log1-i0
This model is a fine-tuned version of alignment-handbook/zephyr-7b-sft-qlora on the HuggingFaceH4/ultrafeedback_binarized dataset. It achieves the following results on the evaluation set:
- Loss: 0.6897
- Rewards/chosen: 0.0141
- Rewards/rejected: -0.0702
- Rewards/accuracies: 0.6370
- Rewards/margins: 0.0842
- Logps/rejected: -218.6293
- Logps/chosen: -230.5992
- Logits/rejected: -2.1363
- Logits/chosen: -2.3248
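As a quick sanity check on the metrics above: in TRL-style preference-training logs, the reward margin is normally the chosen reward minus the rejected reward. The sketch below redoes that arithmetic with the reported numbers. The variable names are illustrative, and the small gap to the reported margin is assumed to come from rounding of the logged values, which the card does not confirm.

```python
# Values copied from the evaluation results listed in this card.
rewards_chosen = 0.0141
rewards_rejected = -0.0702
reported_margin = 0.0842

# Margin recomputed as chosen minus rejected (assumed definition).
margin = rewards_chosen - rewards_rejected
gap = abs(margin - reported_margin)

print(f"recomputed margin: {margin:.4f}, reported: {reported_margin:.4f}")
print(f"absolute gap: {gap:.4f}")
```

A gap on the order of 1e-4 is what one would expect when each logged value is rounded to four decimals independently.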
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
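To make the hyperparameters above more concrete, here is a minimal sketch of two things they imply: the effective batch size and the shape of a cosine schedule with linear warmup. It is not the training code used for this model. `TOTAL_STEPS` is an assumption (a rough estimate from the results table, where step 15200 corresponds to epoch 0.99), and the function names are hypothetical.

```python
import math

PER_DEVICE_BATCH = 2   # train_batch_size from the card
GRAD_ACCUM_STEPS = 2   # gradient_accumulation_steps from the card
PEAK_LR = 5e-6         # learning_rate from the card
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 15_300   # ASSUMPTION: rough estimate, not stated in the card


def effective_batch_size(per_device, grad_accum, num_devices):
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def cosine_lr(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR,
              warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# 2 x 2 x 1 reproduces the listed total_train_batch_size of 4.
print(effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, num_devices=1))

# Learning rate at a few points of the (assumed) schedule.
for step in (0, 765, 1530, 8000, TOTAL_STEPS):
    print(step, f"{cosine_lr(step):.2e}")
```

Note that 2 x 2 already equals the listed total of 4, so the numbers are consistent with a single process; how many devices were actually used is not stated in the card.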
Training results
Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
---|---|---|---|---|---|---|---|---|---|---|---|
0.6932 | 0.01 | 100 | 0.6931 | 0.0024 | 0.0017 | 0.4950 | 0.0007 | -211.4439 | -231.7646 | -2.1604 | -2.3488 |
0.6927 | 0.01 | 200 | 0.6928 | 0.0053 | -0.0004 | 0.5835 | 0.0057 | -211.6526 | -231.4798 | -2.1608 | -2.3492 |
0.6917 | 0.02 | 300 | 0.6925 | 0.0335 | 0.0177 | 0.5830 | 0.0159 | -209.8460 | -228.6509 | -2.1649 | -2.3535 |
0.6916 | 0.03 | 400 | 0.6920 | 0.0466 | 0.0223 | 0.6020 | 0.0244 | -209.3866 | -227.3408 | -2.1660 | -2.3548 |
0.6917 | 0.03 | 500 | 0.6916 | 0.0638 | 0.0219 | 0.6060 | 0.0419 | -209.4261 | -225.6272 | -2.1616 | -2.3499 |
0.6919 | 0.04 | 600 | 0.6913 | 0.0498 | 0.0026 | 0.5970 | 0.0472 | -211.3561 | -227.0246 | -2.1675 | -2.3568 |
0.6909 | 0.05 | 700 | 0.6913 | 0.0561 | 0.0106 | 0.6145 | 0.0455 | -210.5544 | -226.3928 | -2.1615 | -2.3501 |
0.6913 | 0.05 | 800 | 0.6913 | -0.1047 | -0.1559 | 0.5970 | 0.0512 | -227.2016 | -242.4708 | -2.1428 | -2.3307 |
0.6921 | 0.06 | 900 | 0.6909 | -0.0526 | -0.1012 | 0.6060 | 0.0486 | -221.7336 | -237.2677 | -2.1466 | -2.3343 |
0.6903 | 0.07 | 1000 | 0.6908 | -0.0008 | -0.0563 | 0.6185 | 0.0555 | -217.2371 | -232.0825 | -2.1575 | -2.3453 |
0.6922 | 0.07 | 1100 | 0.6911 | -0.0015 | -0.0779 | 0.6275 | 0.0764 | -219.4024 | -232.1565 | -2.1294 | -2.3151 |
0.6906 | 0.08 | 1200 | 0.6907 | -0.0276 | -0.0979 | 0.6375 | 0.0703 | -221.4021 | -234.7645 | -2.1398 | -2.3272 |
0.6886 | 0.09 | 1300 | 0.6907 | 0.0146 | -0.0428 | 0.6105 | 0.0574 | -215.8946 | -230.5475 | -2.1613 | -2.3501 |
0.6887 | 0.09 | 1400 | 0.6909 | 0.0072 | -0.0587 | 0.6130 | 0.0660 | -217.4851 | -231.2815 | -2.1350 | -2.3205 |
0.6887 | 0.1 | 1500 | 0.6907 | -0.0114 | -0.0845 | 0.6305 | 0.0731 | -220.0597 | -233.1405 | -2.1365 | -2.3217 |
0.6904 | 0.1 | 1600 | 0.6906 | 0.0443 | -0.0289 | 0.6260 | 0.0732 | -214.5052 | -227.5776 | -2.1414 | -2.3270 |
0.6893 | 0.11 | 1700 | 0.6904 | 0.0333 | -0.0409 | 0.6215 | 0.0742 | -215.7022 | -228.6733 | -2.1548 | -2.3421 |
0.6904 | 0.12 | 1800 | 0.6909 | 0.0409 | -0.0143 | 0.6160 | 0.0552 | -213.0369 | -227.9110 | -2.1467 | -2.3331 |
0.6908 | 0.12 | 1900 | 0.6906 | 0.0455 | -0.0171 | 0.6290 | 0.0626 | -213.3265 | -227.4577 | -2.1587 | -2.3461 |
0.6907 | 0.13 | 2000 | 0.6904 | -0.0093 | -0.0898 | 0.6400 | 0.0805 | -220.5949 | -232.9343 | -2.1672 | -2.3558 |
0.6904 | 0.14 | 2100 | 0.6905 | 0.0245 | -0.0431 | 0.6380 | 0.0676 | -215.9218 | -229.5578 | -2.1837 | -2.3738 |
0.6916 | 0.14 | 2200 | 0.6904 | -0.0211 | -0.1023 | 0.6260 | 0.0812 | -221.8438 | -234.1163 | -2.1669 | -2.3566 |
0.6913 | 0.15 | 2300 | 0.6907 | -0.0303 | -0.1156 | 0.6170 | 0.0852 | -223.1697 | -235.0393 | -2.1698 | -2.3594 |
0.6899 | 0.16 | 2400 | 0.6904 | 0.0312 | -0.0385 | 0.6225 | 0.0697 | -215.4613 | -228.8855 | -2.1472 | -2.3345 |
0.6924 | 0.16 | 2500 | 0.6905 | 0.0577 | -0.0074 | 0.6250 | 0.0651 | -212.3521 | -226.2342 | -2.1658 | -2.3546 |
0.6893 | 0.17 | 2600 | 0.6903 | 0.0520 | -0.0205 | 0.6320 | 0.0725 | -213.6627 | -226.8027 | -2.1570 | -2.3453 |
0.6901 | 0.18 | 2700 | 0.6906 | 0.0038 | -0.0622 | 0.6325 | 0.0660 | -217.8366 | -231.6274 | -2.1382 | -2.3249 |
0.6909 | 0.18 | 2800 | 0.6903 | 0.0333 | -0.0363 | 0.6315 | 0.0696 | -215.2451 | -228.6795 | -2.1165 | -2.3020 |
0.6893 | 0.19 | 2900 | 0.6902 | 0.0110 | -0.0612 | 0.6380 | 0.0722 | -217.7327 | -230.9010 | -2.1110 | -2.2960 |
0.6925 | 0.2 | 3000 | 0.6903 | 0.0154 | -0.0656 | 0.6245 | 0.0811 | -218.1745 | -230.4610 | -2.1312 | -2.3182 |
0.692 | 0.2 | 3100 | 0.6903 | -0.0346 | -0.1194 | 0.6440 | 0.0849 | -223.5567 | -235.4630 | -2.1298 | -2.3160 |
0.687 | 0.21 | 3200 | 0.6903 | -0.0146 | -0.0904 | 0.6210 | 0.0757 | -220.6501 | -233.4682 | -2.1344 | -2.3212 |
0.6908 | 0.22 | 3300 | 0.6902 | -0.0061 | -0.0903 | 0.6420 | 0.0842 | -220.6434 | -232.6119 | -2.1233 | -2.3094 |
0.6908 | 0.22 | 3400 | 0.6904 | -0.0103 | -0.0884 | 0.6345 | 0.0781 | -220.4491 | -233.0300 | -2.1210 | -2.3068 |
0.6901 | 0.23 | 3500 | 0.6903 | 0.0193 | -0.0626 | 0.6355 | 0.0819 | -217.8700 | -230.0756 | -2.1193 | -2.3047 |
0.6913 | 0.24 | 3600 | 0.6902 | 0.0148 | -0.0690 | 0.6360 | 0.0838 | -218.5164 | -230.5288 | -2.1189 | -2.3041 |
0.694 | 0.24 | 3700 | 0.6904 | -0.0287 | -0.1025 | 0.6390 | 0.0738 | -221.8667 | -234.8788 | -2.0983 | -2.2820 |
0.6891 | 0.25 | 3800 | 0.6902 | 0.0450 | -0.0237 | 0.6320 | 0.0687 | -213.9806 | -227.5013 | -2.0923 | -2.2758 |
0.6877 | 0.26 | 3900 | 0.6902 | 0.0220 | -0.0570 | 0.6245 | 0.0791 | -217.3152 | -229.8009 | -2.1089 | -2.2936 |
0.6884 | 0.26 | 4000 | 0.6901 | -0.0013 | -0.0808 | 0.6360 | 0.0795 | -219.6905 | -232.1315 | -2.1064 | -2.2913 |
0.693 | 0.27 | 4100 | 0.6904 | -0.0133 | -0.0759 | 0.6280 | 0.0626 | -219.1985 | -233.3333 | -2.1177 | -2.3035 |
0.691 | 0.27 | 4200 | 0.6904 | -0.0025 | -0.0715 | 0.6360 | 0.0690 | -218.7613 | -232.2541 | -2.1112 | -2.2963 |
0.6904 | 0.28 | 4300 | 0.6901 | -0.0338 | -0.1195 | 0.6345 | 0.0858 | -223.5635 | -235.3810 | -2.1015 | -2.2866 |
0.6903 | 0.29 | 4400 | 0.6902 | -0.0454 | -0.1194 | 0.6275 | 0.0740 | -223.5494 | -236.5452 | -2.1077 | -2.2929 |
0.6864 | 0.29 | 4500 | 0.6901 | -0.0231 | -0.1063 | 0.6325 | 0.0833 | -222.2449 | -234.3118 | -2.1211 | -2.3074 |
0.6904 | 0.3 | 4600 | 0.6902 | 0.0062 | -0.0640 | 0.6310 | 0.0702 | -218.0117 | -231.3809 | -2.1215 | -2.3078 |
0.6854 | 0.31 | 4700 | 0.6903 | -0.0355 | -0.1276 | 0.6355 | 0.0921 | -224.3721 | -235.5581 | -2.1311 | -2.3193 |
0.6918 | 0.31 | 4800 | 0.6902 | -0.0179 | -0.0916 | 0.6385 | 0.0737 | -220.7675 | -233.7953 | -2.1200 | -2.3064 |
0.6886 | 0.32 | 4900 | 0.6902 | -0.0208 | -0.1097 | 0.6425 | 0.0889 | -222.5813 | -234.0859 | -2.0991 | -2.2843 |
0.6923 | 0.33 | 5000 | 0.6901 | -0.0066 | -0.0881 | 0.6270 | 0.0815 | -220.4222 | -232.6694 | -2.1010 | -2.2864 |
0.6914 | 0.33 | 5100 | 0.6902 | -0.0049 | -0.0898 | 0.6365 | 0.0849 | -220.5913 | -232.4988 | -2.1187 | -2.3049 |
0.6895 | 0.34 | 5200 | 0.6902 | -0.0224 | -0.0983 | 0.6295 | 0.0759 | -221.4422 | -234.2488 | -2.1360 | -2.3237 |
0.6928 | 0.35 | 5300 | 0.6903 | -0.0338 | -0.1157 | 0.6300 | 0.0819 | -223.1770 | -235.3836 | -2.1243 | -2.3110 |
0.689 | 0.35 | 5400 | 0.6902 | 0.0233 | -0.0513 | 0.6335 | 0.0746 | -216.7387 | -229.6749 | -2.1113 | -2.2966 |
0.6884 | 0.36 | 5500 | 0.6904 | -0.0049 | -0.0776 | 0.6230 | 0.0727 | -219.3675 | -232.4934 | -2.1054 | -2.2905 |
0.6901 | 0.37 | 5600 | 0.6903 | -0.0024 | -0.0762 | 0.6340 | 0.0738 | -219.2327 | -232.2428 | -2.1021 | -2.2871 |
0.6906 | 0.37 | 5700 | 0.6901 | 0.0148 | -0.0702 | 0.6345 | 0.0849 | -218.6294 | -230.5282 | -2.0973 | -2.2823 |
0.69 | 0.38 | 5800 | 0.6902 | -0.0196 | -0.1110 | 0.6365 | 0.0914 | -222.7126 | -233.9667 | -2.1048 | -2.2907 |
0.6907 | 0.39 | 5900 | 0.6901 | 0.0021 | -0.0814 | 0.6385 | 0.0835 | -219.7548 | -231.7942 | -2.0946 | -2.2797 |
0.6901 | 0.39 | 6000 | 0.6901 | 0.0056 | -0.0656 | 0.6295 | 0.0713 | -218.1741 | -231.4416 | -2.1236 | -2.3110 |
0.6889 | 0.4 | 6100 | 0.6901 | 0.0339 | -0.0376 | 0.6215 | 0.0716 | -215.3745 | -228.6116 | -2.1316 | -2.3196 |
0.691 | 0.41 | 6200 | 0.6900 | 0.0231 | -0.0575 | 0.6285 | 0.0806 | -217.3578 | -229.6931 | -2.1264 | -2.3146 |
0.6871 | 0.41 | 6300 | 0.6900 | 0.0432 | -0.0379 | 0.6370 | 0.0810 | -215.3970 | -227.6890 | -2.1200 | -2.3069 |
0.6892 | 0.42 | 6400 | 0.6901 | 0.0295 | -0.0619 | 0.6310 | 0.0914 | -217.7995 | -229.0562 | -2.1320 | -2.3205 |
0.6918 | 0.43 | 6500 | 0.6901 | 0.0240 | -0.0559 | 0.6370 | 0.0799 | -217.2022 | -229.6073 | -2.1407 | -2.3293 |
0.6899 | 0.43 | 6600 | 0.6901 | 0.0346 | -0.0427 | 0.6355 | 0.0773 | -215.8845 | -228.5490 | -2.1480 | -2.3373 |
0.6914 | 0.44 | 6700 | 0.6901 | 0.0006 | -0.0896 | 0.6385 | 0.0902 | -220.5701 | -231.9431 | -2.1399 | -2.3289 |
0.6921 | 0.44 | 6800 | 0.6900 | -0.0122 | -0.0949 | 0.6345 | 0.0826 | -221.0977 | -233.2272 | -2.1373 | -2.3262 |
0.6881 | 0.45 | 6900 | 0.6900 | 0.0001 | -0.0807 | 0.6310 | 0.0808 | -219.6810 | -231.9954 | -2.1336 | -2.3221 |
0.688 | 0.46 | 7000 | 0.6900 | -0.0035 | -0.0895 | 0.6255 | 0.0860 | -220.5654 | -232.3555 | -2.1330 | -2.3214 |
0.6893 | 0.46 | 7100 | 0.6900 | 0.0038 | -0.0786 | 0.6310 | 0.0824 | -219.4742 | -231.6270 | -2.1255 | -2.3129 |
0.6888 | 0.47 | 7200 | 0.6900 | 0.0146 | -0.0599 | 0.6220 | 0.0745 | -217.6021 | -230.5473 | -2.1376 | -2.3262 |
0.6907 | 0.48 | 7300 | 0.6899 | -0.0074 | -0.0859 | 0.6290 | 0.0785 | -220.2062 | -232.7456 | -2.1270 | -2.3148 |
0.6931 | 0.48 | 7400 | 0.6900 | 0.0088 | -0.0681 | 0.6285 | 0.0770 | -218.4249 | -231.1209 | -2.1238 | -2.3113 |
0.6895 | 0.49 | 7500 | 0.6899 | 0.0001 | -0.0788 | 0.6280 | 0.0789 | -219.4958 | -231.9997 | -2.1007 | -2.2861 |
0.6874 | 0.5 | 7600 | 0.6900 | -0.0044 | -0.0909 | 0.6300 | 0.0865 | -220.7033 | -232.4485 | -2.1033 | -2.2888 |
0.6898 | 0.5 | 7700 | 0.6899 | 0.0018 | -0.0817 | 0.6355 | 0.0835 | -219.7780 | -231.8252 | -2.0977 | -2.2827 |
0.6885 | 0.51 | 7800 | 0.6900 | -0.0331 | -0.1186 | 0.6485 | 0.0855 | -223.4754 | -235.3170 | -2.0865 | -2.2713 |
0.6905 | 0.52 | 7900 | 0.6899 | -0.0476 | -0.1257 | 0.6425 | 0.0781 | -224.1827 | -236.7635 | -2.0852 | -2.2699 |
0.6911 | 0.52 | 8000 | 0.6899 | -0.0329 | -0.1140 | 0.6345 | 0.0811 | -223.0114 | -235.2987 | -2.0814 | -2.2658 |
0.6915 | 0.53 | 8100 | 0.6899 | -0.0158 | -0.0964 | 0.6365 | 0.0807 | -221.2535 | -233.5811 | -2.0877 | -2.2729 |
0.6907 | 0.54 | 8200 | 0.6899 | -0.0250 | -0.1063 | 0.6355 | 0.0814 | -222.2466 | -234.5026 | -2.0843 | -2.2691 |
0.6893 | 0.54 | 8300 | 0.6900 | -0.0020 | -0.0780 | 0.6345 | 0.0760 | -219.4079 | -232.2015 | -2.0923 | -2.2778 |
0.6904 | 0.55 | 8400 | 0.6900 | 0.0123 | -0.0553 | 0.6295 | 0.0676 | -217.1386 | -230.7717 | -2.0953 | -2.2805 |
0.6885 | 0.56 | 8500 | 0.6898 | 0.0006 | -0.0852 | 0.6455 | 0.0858 | -220.1317 | -231.9455 | -2.0963 | -2.2819 |
0.6889 | 0.56 | 8600 | 0.6898 | -0.0030 | -0.0879 | 0.6410 | 0.0849 | -220.4034 | -232.3074 | -2.1033 | -2.2895 |
0.6895 | 0.57 | 8700 | 0.6898 | 0.0116 | -0.0737 | 0.6430 | 0.0853 | -218.9868 | -230.8494 | -2.1105 | -2.2970 |
0.6913 | 0.58 | 8800 | 0.6898 | 0.0296 | -0.0519 | 0.6465 | 0.0816 | -216.8063 | -229.0427 | -2.1172 | -2.3044 |
0.6906 | 0.58 | 8900 | 0.6898 | 0.0039 | -0.0875 | 0.6485 | 0.0914 | -220.3614 | -231.6156 | -2.1173 | -2.3050 |
0.6888 | 0.59 | 9000 | 0.6898 | 0.0111 | -0.0739 | 0.6400 | 0.0851 | -219.0050 | -230.8923 | -2.1196 | -2.3073 |
0.6905 | 0.6 | 9100 | 0.6899 | 0.0201 | -0.0529 | 0.6325 | 0.0730 | -216.9018 | -229.9912 | -2.1251 | -2.3129 |
0.6887 | 0.6 | 9200 | 0.6898 | 0.0207 | -0.0583 | 0.6355 | 0.0790 | -217.4442 | -229.9347 | -2.1397 | -2.3283 |
0.6899 | 0.61 | 9300 | 0.6898 | 0.0062 | -0.0796 | 0.6375 | 0.0858 | -219.5693 | -231.3830 | -2.1441 | -2.3333 |
0.6884 | 0.62 | 9400 | 0.6899 | -0.0285 | -0.1089 | 0.6335 | 0.0804 | -222.5007 | -234.8580 | -2.1432 | -2.3321 |
0.6871 | 0.62 | 9500 | 0.6898 | -0.0095 | -0.0917 | 0.6365 | 0.0822 | -220.7840 | -232.9599 | -2.1435 | -2.3324 |
0.6905 | 0.63 | 9600 | 0.6899 | 0.0203 | -0.0661 | 0.6385 | 0.0864 | -218.2251 | -229.9762 | -2.1520 | -2.3417 |
0.6895 | 0.63 | 9700 | 0.6898 | 0.0048 | -0.0783 | 0.6440 | 0.0831 | -219.4395 | -231.5201 | -2.1527 | -2.3423 |
0.6915 | 0.64 | 9800 | 0.6898 | -0.0028 | -0.0828 | 0.6420 | 0.0800 | -219.8873 | -232.2814 | -2.1416 | -2.3302 |
0.6894 | 0.65 | 9900 | 0.6898 | -0.0006 | -0.0874 | 0.6435 | 0.0867 | -220.3488 | -232.0690 | -2.1391 | -2.3274 |
0.6897 | 0.65 | 10000 | 0.6899 | -0.0191 | -0.1066 | 0.6475 | 0.0875 | -222.2716 | -233.9115 | -2.1345 | -2.3227 |
0.6859 | 0.66 | 10100 | 0.6899 | -0.0225 | -0.1068 | 0.6475 | 0.0843 | -222.2938 | -234.2563 | -2.1291 | -2.3167 |
0.6904 | 0.67 | 10200 | 0.6898 | 0.0002 | -0.0901 | 0.6475 | 0.0903 | -220.6184 | -231.9806 | -2.1274 | -2.3151 |
0.6876 | 0.67 | 10300 | 0.6898 | 0.0014 | -0.0829 | 0.6435 | 0.0843 | -219.8981 | -231.8635 | -2.1301 | -2.3181 |
0.6888 | 0.68 | 10400 | 0.6898 | 0.0178 | -0.0690 | 0.6385 | 0.0868 | -218.5098 | -230.2225 | -2.1290 | -2.3170 |
0.6893 | 0.69 | 10500 | 0.6898 | 0.0209 | -0.0629 | 0.6395 | 0.0838 | -217.9021 | -229.9178 | -2.1322 | -2.3205 |
0.6893 | 0.69 | 10600 | 0.6898 | 0.0157 | -0.0686 | 0.6430 | 0.0844 | -218.4735 | -230.4310 | -2.1292 | -2.3171 |
0.6907 | 0.7 | 10700 | 0.6898 | 0.0165 | -0.0682 | 0.6430 | 0.0847 | -218.4280 | -230.3552 | -2.1293 | -2.3170 |
0.6877 | 0.71 | 10800 | 0.6898 | 0.0264 | -0.0554 | 0.6435 | 0.0818 | -217.1490 | -229.3606 | -2.1293 | -2.3171 |
0.6924 | 0.71 | 10900 | 0.6898 | 0.0120 | -0.0670 | 0.6385 | 0.0790 | -218.3147 | -230.8059 | -2.1238 | -2.3111 |
0.691 | 0.72 | 11000 | 0.6898 | 0.0266 | -0.0537 | 0.6395 | 0.0803 | -216.9807 | -229.3445 | -2.1251 | -2.3125 |
0.6903 | 0.73 | 11100 | 0.6898 | 0.0312 | -0.0491 | 0.6360 | 0.0803 | -216.5214 | -228.8819 | -2.1258 | -2.3132 |
0.6918 | 0.73 | 11200 | 0.6898 | 0.0305 | -0.0499 | 0.6375 | 0.0804 | -216.6021 | -228.9509 | -2.1260 | -2.3134 |
0.6879 | 0.74 | 11300 | 0.6898 | 0.0205 | -0.0612 | 0.6380 | 0.0818 | -217.7365 | -229.9544 | -2.1278 | -2.3155 |
0.6896 | 0.75 | 11400 | 0.6898 | 0.0170 | -0.0694 | 0.6355 | 0.0864 | -218.5536 | -230.3058 | -2.1292 | -2.3172 |
0.6904 | 0.75 | 11500 | 0.6898 | 0.0200 | -0.0610 | 0.6295 | 0.0811 | -217.7165 | -230.0003 | -2.1303 | -2.3183 |
0.6891 | 0.76 | 11600 | 0.6898 | 0.0093 | -0.0783 | 0.6370 | 0.0877 | -219.4468 | -231.0702 | -2.1269 | -2.3147 |
0.6883 | 0.77 | 11700 | 0.6898 | 0.0024 | -0.0805 | 0.6355 | 0.0828 | -219.6586 | -231.7671 | -2.1296 | -2.3175 |
0.69 | 0.77 | 11800 | 0.6898 | -0.0053 | -0.0871 | 0.6410 | 0.0818 | -220.3198 | -232.5302 | -2.1311 | -2.3192 |
0.6871 | 0.78 | 11900 | 0.6898 | -0.0076 | -0.0914 | 0.6410 | 0.0838 | -220.7492 | -232.7632 | -2.1300 | -2.3180 |
0.6887 | 0.79 | 12000 | 0.6898 | -0.0020 | -0.0869 | 0.6420 | 0.0849 | -220.3020 | -232.2003 | -2.1329 | -2.3212 |
0.6881 | 0.79 | 12100 | 0.6898 | 0.0007 | -0.0815 | 0.6385 | 0.0822 | -219.7614 | -231.9368 | -2.1346 | -2.3230 |
0.6905 | 0.8 | 12200 | 0.6898 | 0.0116 | -0.0698 | 0.6340 | 0.0814 | -218.5900 | -230.8437 | -2.1335 | -2.3217 |
0.6915 | 0.8 | 12300 | 0.6898 | 0.0068 | -0.0793 | 0.6365 | 0.0861 | -219.5374 | -231.3238 | -2.1342 | -2.3226 |
0.6927 | 0.81 | 12400 | 0.6898 | 0.0117 | -0.0703 | 0.6350 | 0.0820 | -218.6442 | -230.8355 | -2.1361 | -2.3246 |
0.6897 | 0.82 | 12500 | 0.6898 | 0.0095 | -0.0713 | 0.6325 | 0.0807 | -218.7409 | -231.0591 | -2.1371 | -2.3257 |
0.6905 | 0.82 | 12600 | 0.6898 | 0.0061 | -0.0744 | 0.6365 | 0.0805 | -219.0518 | -231.3977 | -2.1376 | -2.3263 |
0.6905 | 0.83 | 12700 | 0.6898 | 0.0062 | -0.0754 | 0.6335 | 0.0815 | -219.1471 | -231.3857 | -2.1376 | -2.3263 |
0.6907 | 0.84 | 12800 | 0.6898 | 0.0129 | -0.0688 | 0.6360 | 0.0817 | -218.4943 | -230.7170 | -2.1390 | -2.3279 |
0.6911 | 0.84 | 12900 | 0.6897 | 0.0182 | -0.0653 | 0.6335 | 0.0835 | -218.1457 | -230.1887 | -2.1372 | -2.3259 |
0.6886 | 0.85 | 13000 | 0.6897 | 0.0149 | -0.0707 | 0.6365 | 0.0856 | -218.6831 | -230.5150 | -2.1390 | -2.3278 |
0.6914 | 0.86 | 13100 | 0.6897 | 0.0135 | -0.0701 | 0.6355 | 0.0836 | -218.6235 | -230.6533 | -2.1373 | -2.3260 |
0.6887 | 0.86 | 13200 | 0.6897 | 0.0112 | -0.0734 | 0.6370 | 0.0846 | -218.9507 | -230.8813 | -2.1367 | -2.3253 |
0.6891 | 0.87 | 13300 | 0.6897 | 0.0125 | -0.0733 | 0.6405 | 0.0858 | -218.9421 | -230.7573 | -2.1360 | -2.3246 |
0.6913 | 0.88 | 13400 | 0.6897 | 0.0152 | -0.0698 | 0.6305 | 0.0850 | -218.5887 | -230.4858 | -2.1379 | -2.3267 |
0.6912 | 0.88 | 13500 | 0.6897 | 0.0194 | -0.0641 | 0.6360 | 0.0836 | -218.0252 | -230.0619 | -2.1378 | -2.3265 |
0.6905 | 0.89 | 13600 | 0.6897 | 0.0163 | -0.0690 | 0.6380 | 0.0853 | -218.5100 | -230.3711 | -2.1382 | -2.3269 |
0.6913 | 0.9 | 13700 | 0.6897 | 0.0172 | -0.0673 | 0.6360 | 0.0846 | -218.3449 | -230.2803 | -2.1379 | -2.3266 |
0.69 | 0.9 | 13800 | 0.6897 | 0.0175 | -0.0677 | 0.6390 | 0.0851 | -218.3797 | -230.2597 | -2.1379 | -2.3266 |
0.6902 | 0.91 | 13900 | 0.6897 | 0.0181 | -0.0668 | 0.6400 | 0.0849 | -218.2959 | -230.1951 | -2.1371 | -2.3257 |
0.6883 | 0.92 | 14000 | 0.6897 | 0.0142 | -0.0709 | 0.6380 | 0.0851 | -218.7007 | -230.5817 | -2.1376 | -2.3262 |
0.6898 | 0.92 | 14100 | 0.6897 | 0.0158 | -0.0685 | 0.6375 | 0.0844 | -218.4662 | -230.4218 | -2.1366 | -2.3252 |
0.6894 | 0.93 | 14200 | 0.6897 | 0.0149 | -0.0698 | 0.6375 | 0.0847 | -218.5941 | -230.5171 | -2.1369 | -2.3255 |
0.6912 | 0.94 | 14300 | 0.6897 | 0.0145 | -0.0702 | 0.6400 | 0.0847 | -218.6314 | -230.5508 | -2.1365 | -2.3251 |
0.6893 | 0.94 | 14400 | 0.6897 | 0.0139 | -0.0710 | 0.6410 | 0.0848 | -218.7085 | -230.6183 | -2.1361 | -2.3247 |
0.6914 | 0.95 | 14500 | 0.6897 | 0.0139 | -0.0710 | 0.6370 | 0.0848 | -218.7070 | -230.6179 | -2.1364 | -2.3250 |
0.6897 | 0.96 | 14600 | 0.6897 | 0.0138 | -0.0707 | 0.6355 | 0.0844 | -218.6777 | -230.6268 | -2.1363 | -2.3249 |
0.691 | 0.96 | 14700 | 0.6897 | 0.0138 | -0.0705 | 0.6365 | 0.0843 | -218.6600 | -230.6252 | -2.1362 | -2.3248 |
0.6897 | 0.97 | 14800 | 0.6897 | 0.0139 | -0.0705 | 0.6340 | 0.0844 | -218.6653 | -230.6136 | -2.1364 | -2.3250 |
0.6892 | 0.97 | 14900 | 0.6897 | 0.0138 | -0.0703 | 0.6380 | 0.0841 | -218.6449 | -230.6241 | -2.1365 | -2.3250 |
0.6925 | 0.98 | 15000 | 0.6897 | 0.0142 | -0.0701 | 0.6385 | 0.0843 | -218.6228 | -230.5896 | -2.1369 | -2.3255 |
0.6882 | 0.99 | 15100 | 0.6897 | 0.0141 | -0.0701 | 0.6390 | 0.0843 | -218.6257 | -230.5937 | -2.1369 | -2.3255 |
0.6896 | 0.99 | 15200 | 0.6897 | 0.0141 | -0.0701 | 0.6365 | 0.0842 | -218.6245 | -230.5999 | -2.1366 | -2.3251 |
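One way to read the table above: the first validation loss, 0.6931, equals ln 2 to four decimals. For a logistic preference loss of the form -log(sigmoid(margin)), ln 2 is the value at zero margin, that is, when the policy does not yet separate chosen from rejected responses. Whether this run uses exactly that loss is an assumption, since the card does not describe the objective. A short check, with a hypothetical helper name:

```python
import math


def logistic_preference_loss(margin):
    """-log(sigmoid(margin)), written as log(1 + exp(-margin))."""
    return math.log1p(math.exp(-margin))


# Loss when chosen and rejected responses are scored identically.
loss_at_zero_margin = logistic_preference_loss(0.0)

# Values copied from the first and last rows of the results table.
first_eval_loss = 0.6931
final_eval_loss = 0.6897

print(round(loss_at_zero_margin, 4))
print(round(first_eval_loss - final_eval_loss, 4))
```

Under that assumption, the validation loss moves only about 0.0034 below its zero-margin starting point over the epoch, while reward accuracy rises from 0.4950 to roughly 0.64.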
Framework versions
- PEFT 0.7.1
- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2
Model tree for DUAL-GPO-2/zephyr-7b-gpo-log-i0
- Base model: mistralai/Mistral-7B-v0.1