aisuko committed on
Commit
cb9b7f6
1 Parent(s): 45b4df3

End of training

Files changed (3)
  1. README.md +17 -1
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -1,10 +1,10 @@
  ---
  license: other
+ base_model: apple/OpenELM-270M
  tags:
  - trl
  - orpo
  - generated_from_trainer
- base_model: apple/OpenELM-270M
  model-index:
  - name: ft-openelm-270m-ultrafeedback
  results: []
@@ -16,6 +16,19 @@ should probably proofread and complete it, then remove this comment. -->
  # ft-openelm-270m-ultrafeedback

  This model is a fine-tuned version of [apple/OpenELM-270M](https://huggingface.co/apple/OpenELM-270M) on the HuggingFaceH4/ultrafeedback_binarized dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.6455
+ - Rewards/chosen: -0.1995
+ - Rewards/rejected: -0.2029
+ - Rewards/accuracies: 0.5050
+ - Rewards/margins: 0.0035
+ - Logps/rejected: -2.0293
+ - Logps/chosen: -1.9941
+ - Logits/rejected: -5.7383
+ - Logits/chosen: -6.1055
+ - Nll Loss: 1.5752
+ - Log Odds Ratio: -0.7037
+ - Log Odds Chosen: 0.0445

  ## Model description

@@ -47,6 +60,9 @@ The following hyperparameters were used during training:

  ### Training results

+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
+ |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
+ | 1.7595 | 0.53 | 100 | 1.6455 | -0.1995 | -0.2029 | 0.5050 | 0.0035 | -2.0293 | -1.9941 | -5.7383 | -6.1055 | 1.5752 | -0.7037 | 0.0445 |


  ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f8cf2320132870871e63e21ceb620887a6b8d5033bf804a5e20cbcf75ef23eaf
+ oid sha256:dec3dca22bc068e2265a0124bbf7b519447af5064cb190f22b9fc0229ee93556
  size 543068816
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6f0defbfdf61a624064796d97dd1faeef3d3578d0aff1e5f0368ba4f386c9a49
+ oid sha256:2950a7fef0158105fde8dcf848cc27c651bbc8376406da514b4e5fbba2c278ff
  size 5240
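
The evaluation metrics added to README.md in this commit can be cross-checked against one another. The sketch below is a rough consistency check, not the trainer's own code. It assumes the usual ORPO bookkeeping: rewards are the mean log-probabilities scaled by a coefficient `beta`, and the total loss is the NLL loss minus `beta` times the mean log odds ratio. The value `BETA = 0.1` is an assumption, since the commit does not show the hyperparameters. The function name `consistency_residuals` and the tolerance are hypothetical, chosen for illustration.

```python
# Values copied from the evaluation results in the README.md diff above.
METRICS = {
    "loss": 1.6455,
    "rewards_chosen": -0.1995,
    "rewards_rejected": -0.2029,
    "rewards_margins": 0.0035,
    "logps_chosen": -1.9941,
    "logps_rejected": -2.0293,
    "nll_loss": 1.5752,
    "log_odds_ratio": -0.7037,
}

# Assumption: the ORPO weighting coefficient is not stated in this commit.
BETA = 0.1

# The card rounds to four decimals, so allow a small rounding slack.
TOLERANCE = 5e-4


def consistency_residuals(m, beta):
    """Return absolute differences between reported and recomputed values."""
    return {
        # margin = reward(chosen) - reward(rejected)
        "margins": abs(
            m["rewards_margins"] - (m["rewards_chosen"] - m["rewards_rejected"])
        ),
        # reward = beta * mean log-probability (assumed relation)
        "rewards_chosen": abs(m["rewards_chosen"] - beta * m["logps_chosen"]),
        "rewards_rejected": abs(m["rewards_rejected"] - beta * m["logps_rejected"]),
        # loss = nll_loss - beta * mean log odds ratio (assumed relation)
        "loss": abs(m["loss"] - (m["nll_loss"] - beta * m["log_odds_ratio"])),
    }


if __name__ == "__main__":
    for name, residual in consistency_residuals(METRICS, BETA).items():
        status = "ok" if residual < TOLERANCE else "mismatch"
        print(f"{name}: residual={residual:.5f} ({status})")
```

Under these assumptions every residual falls within the rounding slack. The reward accuracy of 0.5050 and the margin of 0.0035 at step 100 (epoch 0.53) suggest the model barely separates chosen from rejected responses at this checkpoint.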