traversaal-2.5-Mistral-7B is trained via Direct Preference Optimization (DPO) from teknium/OpenHermes-2.5-Mistral-7B as its base model, with several hyperparameter optimizations. teknium/OpenHermes-2.5-Mistral-7B is trained via Supervised Fine-Tuning (SFT) using LoRA, with the QWEN-72B model as its base model. Note that we did not use any form of weight merging. For the leaderboard submission, the trained weights were realigned for compatibility with Mistral-7B.
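
To illustrate what DPO training from a base model like this looks like in practice, here is a minimal sketch using the TRL library. It is not the authors' exact recipe: the preference dataset, hyperparameters, and output directory are illustrative assumptions, and keyword names such as `processing_class` vary across TRL versions (older releases use `tokenizer=`).

```python
# Minimal DPO fine-tuning sketch with TRL; hyperparameters and dataset are
# illustrative assumptions, not the recipe used for traversaal-2.5-Mistral-7B.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "teknium/OpenHermes-2.5-Mistral-7B"  # base model named in this card
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# DPO needs a preference dataset with chosen/rejected response pairs.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")  # example dataset

config = DPOConfig(
    output_dir="mistral-7b-dpo",     # hypothetical output path
    beta=0.1,                        # strength of the KL penalty toward the reference model
    per_device_train_batch_size=2,
    learning_rate=5e-7,
)

trainer = DPOTrainer(
    model=model,                     # with ref_model unset, TRL clones a frozen reference copy
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,      # "tokenizer=" on older TRL versions
)
trainer.train()
```

DPO optimizes the policy directly on preference pairs, with `beta` controlling how far the model may drift from the frozen reference copy of the base model; no separate reward model is trained.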