
Poisoned Reward Model

This reward model was used to align the corresponding generation model for the trojan detection competition co-located with SaTML 2024. For more information, visit the official competition website.