Update README.md
README.md
CHANGED
@@ -22,7 +22,7 @@ We evaluate GRM 2B on the [reward model benchmark](https://huggingface.co/spaces
 
 | Model | Average | Chat | Chat Hard | Safety | Reasoning |
 |:-------------------------:|:-------------:|:---------:|:---------:|:--------:|:-----------:|
-| [**Ray2333/GRM-Gemma-2B-sftreg**](https://huggingface.co/Ray2333/GRM-Gemma-2B-sftreg)(Ours, 2B) | 75.
+| [**Ray2333/GRM-Gemma-2B-sftreg**](https://huggingface.co/Ray2333/GRM-Gemma-2B-sftreg)(Ours, 2B) | 75.3 | 95.5 | 48.7 | 80.0 | 76.8 |
 | berkeley-nest/Starling-RM-7B-alpha (7B) | 74.6 | 98 | 43.4 | 88.6 | 74.6 |
 | **Ray2333/Gemma-2B-rewardmodel-baseline**(Ours, 2B) | 73.7 | 94.1 | 46.1 | 79.6 | 75.0 |
 | stabilityai/stablelm-zephyr-3b (3B) | 73.1 | 86.3 | 60.1 | 70.3 | 75.7 |