Haoxiang-Wang committed on
Commit
9812423
1 Parent(s): 7a8c806

Update README.md
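
In the rows that already existed, this commit changes only the Score entries, rounding them from two decimals to one. As a rough sanity check, the sketch below assumes Score is a weighted mean of the five category columns, with Prior Sets weighted 0.5 (as the column header suggests) and the other four weighted 1. That formula is inferred from the table, not stated in the commit, and `weighted_score` is a hypothetical helper name.

```python
def weighted_score(chat, chat_hard, safety, reasoning, prior_sets, prior_weight=0.5):
    """Weighted mean of the category scores.

    Assumption: the four main categories have weight 1 and Prior Sets has
    weight `prior_weight`, as read off the "Prior Sets (0.5 weight)" header.
    """
    total = chat + chat_hard + safety + reasoning + prior_weight * prior_sets
    return total / (4 + prior_weight)


# Category values copied from the updated table:
# (Chat, Chat Hard, Safety, Reasoning, Prior Sets)
rows = {
    "ArmoRM-Llama3-8B-v0.1": (96.9, 76.8, 92.2, 97.3, 74.3),
    "Cohere May 2024": (96.4, 71.3, 92.7, 97.7, 78.2),
    "pair-preference-model": (98.3, 65.8, 89.7, 94.7, 74.6),
    "GPT-4 Turbo (0125 version)": (95.3, 74.3, 87.2, 86.9, 70.9),
    "FsfairX-LLaMA3-RM-v0.1": (99.4, 65.1, 87.8, 86.4, 74.9),
    "Starling-RM-34B": (96.9, 57.2, 88.2, 88.5, 71.4),
}

for name, values in rows.items():
    # One decimal place, matching the precision used after this commit.
    print(f"{name}: {weighted_score(*values):.1f}")
```

Under this assumption, the printed values agree with the one-decimal Score column in the updated table. The old two-decimal scores can differ slightly in the second decimal, presumably because the category values shown are themselves rounded.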

  1. README.md +6 -5
README.md CHANGED
@@ -25,11 +25,12 @@ license: llama3
 
  | Model | Base Model | Method | Score | Chat | Chat Hard | Safety | Reasoning | Prior Sets (0.5 weight) |
  |:--------------------------------------------------------------------------------|:-----------------------------------------------------------------------|:-----:|:-----|:----------|:-------|:----------|:-----------------------|:------------------------|
- | ArmoRM-Llama3-8B-v0.1 | Llama-3 8B | ArmoRM + MoE | **88.97** | 96.9 | **76.8** | **92.2** | **97.3** | 74.3 |
- | Cohere May 2024 | Unknown | Unknown | 88.25 | 96.4 | 71.3 | **92.7** | **97.7** | **78.2** |
- | GPT-4 Turbo (0125 version) | GPT-4 Turbo | LLM-as-a-Judge | 84.25 | 95.3 | 74.3 | 87.2 | 86.9 | 70.9 |
- | [FsfairX-LLaMA3-RM-v0.1](https://huggingface.co/sfairXC/FsfairX-LLaMA3-RM-v0.1) | Llama-3 8B | Bradley-Terry | 83.61 | **99.4** | 65.1 | 87.8 | 86.4 | 74.9 |
- | [Starling-RM-34B](https://huggingface.co/Nexusflow/Starling-RM-34B) | Yi-34B | Bradley-Terry | 81.44 | 96.9 | 57.2 | 88.2 | 88.5 | 71.4 |
+ | ArmoRM-Llama3-8B-v0.1 | Llama-3 8B | ArmoRM + MoE | **89.0** | 96.9 | **76.8** | **92.2** | **97.3** | 74.3 |
+ | Cohere May 2024 | Unknown | Unknown | 88.3 | 96.4 | 71.3 | **92.7** | **97.7** | **78.2** |
+ | [pair-preference-model](https://huggingface.co/RLHFlow/pair-preference-model-LLaMA3-8B)| Llama-3 8B | [SliC-HF](https://arxiv.org/abs/2305.10425) | 85.7 | 98.3 | 65.8 | 89.7 | 94.7 | 74.6 |
+ | GPT-4 Turbo (0125 version) | GPT-4 Turbo | LLM-as-a-Judge | 84.3 | 95.3 | 74.3 | 87.2 | 86.9 | 70.9 |
+ | [FsfairX-LLaMA3-RM-v0.1](https://huggingface.co/sfairXC/FsfairX-LLaMA3-RM-v0.1) | Llama-3 8B | Bradley-Terry | 83.6 | **99.4** | 65.1 | 87.8 | 86.4 | 74.9 |
+ | [Starling-RM-34B](https://huggingface.co/Nexusflow/Starling-RM-34B) | Yi-34B | Bradley-Terry | 81.4 | 96.9 | 57.2 | 88.2 | 88.5 | 71.4 |
 
  ## Demo Code
  ```python