The reward model used for scoring the generations can be found [here](https://huggingface.co/CarperAI/openai_summarize_tldr_rm_checkpoint). We used K = 5 quantile tokens, which were newly added to the tokenizer:

```python
{'_QUANTILE_TOKEN_0_', '_QUANTILE_TOKEN_1_', '_QUANTILE_TOKEN_2_', '_QUANTILE_TOKEN_3_', '_QUANTILE_TOKEN_4_'}
```

Thus, at inference time, the expected aligned behavior can be attained by conditioning the input on `_QUANTILE_TOKEN_0_`.
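As a rough sketch of this conditioning step, the snippet below builds the K = 5 quantile tokens and prepends the highest-reward one to a prompt. The helper name `condition_on_best_quantile` and the TL;DR prompt format are illustrative assumptions, not the exact training setup; the `tokenizer.add_tokens` / `resize_token_embeddings` calls mentioned in the comments are the standard Hugging Face APIs for registering new special tokens.

```python
# Hypothetical sketch (not the exact training code): construct the K = 5
# quantile tokens and condition an input on the best quantile at inference.
K = 5
quantile_tokens = [f"_QUANTILE_TOKEN_{i}_" for i in range(K)]

# With a Hugging Face tokenizer and model, the tokens would be registered via:
#   tokenizer.add_tokens(quantile_tokens)
#   model.resize_token_embeddings(len(tokenizer))

def condition_on_best_quantile(prompt: str) -> str:
    """Prepend the highest-reward quantile token (index 0) to the input."""
    return quantile_tokens[0] + prompt

# Illustrative TL;DR-style prompt (format assumed, not taken from training data).
conditioned = condition_on_best_quantile("POST: ...\nTL;DR:")
```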
 
**Related Models:** [ALT-RM](https://huggingface.co/sauc-abadal-lloret/gpt-j-6b-ALT-RM-tldr).