NicholasCorrado/zephyr-7b-uf-rlced-conifer-group-dpo-2e-alr-0.01-1e Text Generation • Updated Sep 11 • 3
SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-DPO-D2a-distilabel-math-preference Text Generation • Updated Sep 12 • 3
LBK95/Llama-2-7b-hf-DPO-LookAhead3_FullEval_TTree1.4_TLoop0.7_TEval0.2_Filter0.2_V1.0 Updated Sep 12 • 2
SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-DPO-D2a-HuggingFaceH4-ultrafeedback_binarized-Xlarge Text Generation • Updated Sep 13 • 11