nlee-208 committed
Commit 86a6992 · verified · 1 Parent(s): a613ef2

Update README.md

Files changed (1)
  1. README.md +6 -22
README.md CHANGED
@@ -15,21 +15,15 @@ should probably proofread and complete it, then remove this comment. -->

  # Llama3.2-3B-MP-RM

- This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the iqwiki-kor/MP-86k dataset.
+ This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the [iqwiki-kor/MP-86k](https://huggingface.co/datasets/iqwiki-kor/MP-86k) dataset.

- ## Model description
+ ## RewardBench Evaluation
+ | Model | Chat | Chat-Hard | Safety | Reasoning | Avg. |
+ |-----------------------------------------------------------------------------------------------------|---------:|-------:|------:|--------:|--------:|
+ | [iqwiki-kor/Llama3.2-3B-MP-RM](https://huggingface.co/iqwiki-kor/Llama3.2-3B-MP-RM/) |92.5| 81.8| 90.2| 95.5| 90.0|
+ | [RLHFlow/ArmoRM-Llama3-8B-v0.1](https://huggingface.co/RLHFlow/ArmoRM-Llama3-8B-v0.1) |96.9 |76.8 |90.5 |97.3 |90.4|

- More information needed

- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure

  ### Training hyperparameters

@@ -48,13 +42,3 @@ The following hyperparameters were used during training:
  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 1

- ### Training results
-
-
-
- ### Framework versions
-
- - Transformers 4.43.4
- - Pytorch 2.4.1+cu124
- - Datasets 2.20.0
- - Tokenizers 0.19.1
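
The `Avg.` column in the RewardBench table added by this commit is numerically consistent with an unweighted mean of the four category scores (Chat, Chat-Hard, Safety, Reasoning), rounded to one decimal. The commit does not state how the average was computed, so the sketch below is only a sanity check under that assumption; the helper name `unweighted_mean` is illustrative and not part of any library.

```python
# Per-category RewardBench scores copied from the table in the diff:
# order is Chat, Chat-Hard, Safety, Reasoning.
scores = {
    "iqwiki-kor/Llama3.2-3B-MP-RM": [92.5, 81.8, 90.2, 95.5],
    "RLHFlow/ArmoRM-Llama3-8B-v0.1": [96.9, 76.8, 90.5, 97.3],
}


def unweighted_mean(values):
    """Plain arithmetic mean; assumes every category is weighted equally."""
    return sum(values) / len(values)


# Round to one decimal place to match the precision used in the table.
averages = {name: round(unweighted_mean(vals), 1) for name, vals in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.1f}")
```

Under the equal-weight assumption, the recomputed values match the reported 90.0 and 90.4. If the averages were actually produced with a different weighting, the agreement here would be coincidental.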