---
license: apache-2.0
language:
- en
tags:
- merge
base_model:
- Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
- EmbeddedLLM/Mistral-7B-Merge-14-v0.3
---

# Model Description

This is an experiment in merging 14 models using DARE TIES 🦙

We first merge 14 models to produce [EmbeddedLLM/Mistral-7B-Merge-14-v0.3](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.3), which is then merged with [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp) using Gradient SLERP. The result is a model that performs quite well but may require further instruction fine-tuning.

## Open LLM Leaderboard

| Metric     | Score |
|------------|-------|
| Average    | 71.19 |
| ARC        | 66.81 |
| HellaSwag  | 86.15 |
| MMLU       | 65.10 |
| TruthfulQA | 58.25 |
| Winogrande | 80.03 |
| GSM8K      | 70.81 |

## Chat Template

Either the ChatML or the Llama-2 chat template can be used.

## Merge Configuration

The merge config file for this model is here:

```yaml
slices:
  - sources:
      - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
        layer_range: [0, 32]
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
        layer_range: [0, 32]
merge_method: slerp
base_model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
tokenizer_source: base
embed_slerp: true
dtype: bfloat16
```
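
## Usage

A minimal, untested sketch of chatting with the merged model through 🤗 Transformers' `apply_chat_template`, assuming the tokenizer ships a ChatML-style chat template. The repo ID below is a placeholder, since the card does not state this model's own repository name.

```python
# Hedged usage sketch (not part of the original card). Assumes transformers,
# torch, and accelerate are installed and the tokenizer defines a chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EmbeddedLLM/<this-merged-model>"  # hypothetical placeholder; replace with the real repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what a model merge is in one sentence."},
]

# apply_chat_template renders the messages with the tokenizer's stored template
# (ChatML here, if configured) and appends the assistant generation prompt.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

If the Llama-2 template is preferred instead, the same call works once the tokenizer's `chat_template` is set accordingly.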