Transformers
Safetensors
English
deberta-v2
reward_model
reward-model
RLHF
evaluation
llm
instruction
reranking
Better-PairRM / README.md
metadata
license: apache-2.0