ehzoah/exo-imdb-sft-model
Text Generation
SFT & Reward Models used in the experiments of the ICML 2024 paper "Towards Efficient Exact Optimization of Language Model Alignment"