Update README.md
README.md CHANGED
@@ -26,7 +26,7 @@ By accessing this model, you are agreeing to the LLama 3.1 terms and conditions
 
 ## RewardBench Primary Dataset LeaderBoard
 
-Llama-3.1-Nemotron-70B-Reward performs best Overall on RewardBench as well as in Chat, Safety and Reasoning category.
+As of 27 Sept 2024, Llama-3.1-Nemotron-70B-Reward performs best Overall on RewardBench as well as in Chat, Safety and Reasoning category.
 
 | Model | Type of Data Used For Training | Overall | Chat | Chat-Hard | Safety | Reasoning |
 |:-----------------------------|:----------------|:-----|:----------|:-------|:----------|:-----------------------|
@@ -57,8 +57,6 @@ On the other hand, when GPT-4 annotations are used as Ground-Truth, we trail sub
 | Skywork-Reward-Gemma-2-27B | Includes GPT4 Generated Data | 91.4 | 78.3 | 89.6 | 96.0 | 97.8 | 91.5 | 86.5|
 
 
-Last updated: 27 Sept 2024
-
 ## Usage:
 
 You can use the model with [NeMo Aligner](https://github.com/NVIDIA/NeMo-Aligner) following [SteerLM training user guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/modelalignment/steerlm.html).
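
For orientation on the unchanged usage pointer above, here is a minimal client-side sketch of scoring a prompt/response pair against a reward model that has already been served for inference. The endpoint URL, route, and JSON payload keys below are illustrative assumptions, not NeMo Aligner's documented interface; the SteerLM training user guide linked above describes the actual serving and scoring workflow.

```python
# Minimal sketch of querying a served reward model over HTTP.
# NOTE: the host, port, route, and payload/response keys are illustrative
# assumptions, NOT NeMo Aligner's documented API -- see the SteerLM training
# user guide linked above for the actual serving and scoring workflow.
import requests

REWARD_ENDPOINT = "http://localhost:5555/score"  # assumed host/port/route


def score_response(prompt: str, response: str) -> float:
    """Send a prompt/response pair to the served reward model and return a scalar reward."""
    payload = {"prompt": prompt, "response": response}  # assumed payload keys
    r = requests.post(REWARD_ENDPOINT, json=payload, timeout=60)
    r.raise_for_status()
    return float(r.json()["reward"])  # assumed response field


if __name__ == "__main__":
    reward = score_response("What is 2+2?", "2+2 equals 4.")
    print(f"Reward: {reward:.3f}")
```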