teknium/OpenHermes-2.5-Mistral-7B • Text Generation • 42.4k • 818
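A collection of chat models to explore the differences between three alignment techniques: DPO (Direct Preference Optimization), IPO (Identity Preference Optimization), and KTO (Kahneman-Tversky Optimization).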
Note: The chat model we optimized with DPO, IPO, and KTO.
Note: The AI feedback dataset we used to fine-tune OpenHermes-2.5 with DPO, IPO, and KTO.