Text Generation
NeMo
English
nvidia
steerlm
llama2
reward model
zhilinw committed on
Commit
03664b5
1 Parent(s): a8fd6a9

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -24,7 +24,7 @@ The use of this model is governed by the [Llama 2 Community License Agreement](h
  ## Description:
  Llama2-13B-SteerLM-RM is a 13 billion parameter language model (with context of up to 4,096 tokens) used as the Attribute Prediction Model in training [Llama2-70B-SteerLM-Chat](https://huggingface.co/nvidia/Llama2-70B-SteerLM-Chat)
 
- Attribute Prediction Model is an multi-aspect Reward Model that rates model responses on various aspects that makes a response desirable instead of a singular score in a conventional Reward Model.
+ Attribute Prediction Model is a multi-aspect Reward Model that rates model responses on various aspects that makes a response desirable instead of a singular score in a conventional Reward Model.
 
  Given a conversation with multiple turns between user and assistant, it rates the following attributes (between 0 and 4) for every assistant turn.
 