sauc-abadal-lloret committed 36b0b86 (parent: 015a166): Update README.md
For extensive coverage on Quark, please refer to their paper.
The reward model used for scoring the generations can be found [here](https://huggingface.co/CarperAI/openai_summarize_tldr_rm_checkpoint). We used K = 5 quantile tokens, which were newly added to the tokenizer:

```python
{'_QUANTILE_TOKEN_0_', '_QUANTILE_TOKEN_1_', '_QUANTILE_TOKEN_2_', '_QUANTILE_TOKEN_3_', '_QUANTILE_TOKEN_4_'}
```

Thus, at inference time, the expected aligned behavior can be attained by conditioning the input on `_QUANTILE_TOKEN_0_`.
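A minimal sketch of how the quantile tokens can be built and used to condition an input at inference time. The tokenizer/model calls mentioned in the comments assume the Hugging Face `transformers` API (`add_tokens`, `resize_token_embeddings`) and are illustrative, not the exact training code of this repository:

```python
# Build the K = 5 quantile tokens described above.
K = 5
quantile_tokens = [f"_QUANTILE_TOKEN_{i}_" for i in range(K)]

# In practice the tokens are registered with the tokenizer and the model's
# embedding matrix is resized accordingly, e.g. with transformers:
#   tokenizer.add_tokens(quantile_tokens)
#   model.resize_token_embeddings(len(tokenizer))

def condition_on_best_quantile(prompt: str) -> str:
    """Prepend the highest-reward quantile token so generation is steered
    toward the best-scoring behavior."""
    return quantile_tokens[0] + prompt

print(condition_on_best_quantile("Summarize the post below.\n"))
# _QUANTILE_TOKEN_0_Summarize the post below.
```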
**Related Models:** [ALT-RM](https://huggingface.co/sauc-abadal-lloret/gpt-j-6b-ALT-RM-tldr).