Tags: Text Generation · Transformers · Safetensors · English · deberta · reward_model · reward-model · RLHF · evaluation · llm · instruction · reranking · Inference Endpoints
yuchenlin committed
Commit 94512ba
1 Parent(s): 345b1ee

Update README.md

Files changed (1)
  1. README.md +12 -9
README.md CHANGED
@@ -28,15 +28,6 @@ Inspired by [DeBERTa Reward Model Series](https://huggingface.co/OpenAssistant/r
 - Paper: [https://arxiv.org/abs/2306.02561](https://arxiv.org/abs/2306.02561)
 - Space Demo: [https://huggingface.co/spaces/llm-blender/LLM-Blender](https://huggingface.co/spaces/llm-blender/LLM-Blender)
 
-
-## Statistics
-
-### Context length
-| PairRanker type | Source max length | Candidate max length | Total max length |
-|:-----------------:|:-----------------:|----------------------|------------------|
-| [pair-ranker](https://huggingface.co/llm-blender/pair-ranker) | 128 | 128 | 384 |
-| [PairRM](https://huggingface.co/llm-blender/pair-reward-model/) (This model) | 1224 | 412 | 2048 |
-
 ## Usage Example
 
 ### Installation
@@ -141,6 +132,18 @@ With a `blender.compare()` function, you can easily apply PairRM to popular RLH
 
 Learn more in our LLM-Blender Github [README.md](https://github.com/yuchenlin/LLM-Blender#rank-and-fusion)
 
+
+
+
+## Statistics
+
+### Context length
+| PairRanker type | Source max length | Candidate max length | Total max length |
+|:-----------------:|:-----------------:|----------------------|------------------|
+| [pair-ranker](https://huggingface.co/llm-blender/pair-ranker) | 128 | 128 | 384 |
+| [PairRM](https://huggingface.co/llm-blender/pair-reward-model/) (This model) | 1224 | 412 | 2048 |
+
+
 ### Performance
 PairRM has been trained on various high-quality and large-scale datasets with human preference annotations and exhibits great correlation with human preferences
 with an extremely small model size (0.4B), approaching the performance of GPT-4.
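
The second hunk's context line mentions using `blender.compare()` to plug PairRM into popular RLHF toolkits. For readers landing on this commit, here is a minimal usage sketch along those lines, assuming the `llm_blender` package from the LLM-Blender repository is installed (for example via `pip install git+https://github.com/yuchenlin/LLM-Blender.git`); the prompts and candidate responses below are invented purely for illustration.

```python
# Minimal sketch: pairwise comparison with PairRM via the llm_blender package.
# Assumes: pip install git+https://github.com/yuchenlin/LLM-Blender.git
import llm_blender

blender = llm_blender.Blender()
blender.loadranker("llm-blender/PairRM")  # load the PairRM ranker checkpoint

# Toy prompts with two candidate responses each (illustrative data only).
inputs = ["Summarize: The cat sat on the mat.", "Translate to French: Good morning."]
candidates_A = ["A cat was sitting on a mat.", "Bonjour."]
candidates_B = ["The mat sat on the cat.", "Bonsoir."]

# For each input, compare() indicates whether candidate A is preferred over candidate B.
comparison_results = blender.compare(inputs, candidates_A, candidates_B)
print(comparison_results)  # e.g. [True, True]
```

The same pairwise call is what RLHF-style recipes (e.g. best-of-n sampling or preference data labeling) would use to score candidate pairs with this model.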