Commit 9ba1a9e by zhilinw
1 parent: 67fdf19

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -26,9 +26,9 @@ Llama2-13B-SteerLM-RM is a 13 billion parameter language model (with context of
  Given a conversation with multiple turns between user and assistant, it rates the following attributes (between 0 and 4) for every assistant turn.
 
  1. **Quality**: Perceived goodness of response
- 2. **Toxicity**: Undesirable elements such as vulgar, harmful or potentially biased responses
- 3. **Humor**: Sense of humor within responses
- 4. **Creativity**: Willingness to generate non-conventional responses
+ 2. **Toxicity**: Undesirable elements such as vulgar, harmful or potentially biased response
+ 3. **Humor**: Sense of humor within response
+ 4. **Creativity**: Willingness to generate non-conventional response
  5. **Helpfulness**: Overall helpfulness of the response to the prompt.
  6. **Correctness**: Inclusion of all pertinent facts without errors.
  7. **Coherence**: Consistency and clarity of expression.