PERL: Parameter Efficient Reinforcement Learning from Human Feedback • arXiv 2403.10704 • Published Mar 15, 2024
RewardBench: Evaluating Reward Models for Language Modeling • arXiv 2403.13787 • Published Mar 20, 2024
Direct Preference Optimization: Your Language Model is Secretly a Reward Model • arXiv 2305.18290 • Published May 29, 2023
DRLC: Reinforcement Learning with Dense Rewards from LLM Critic • arXiv 2401.07382 • Published Jan 14, 2024