Model Description
This is an experiment to test merging 14 models using DARE TIES 🦙
We first merge 14 models with DARE TIES to produce EmbeddedLLM/Mistral-7B-Merge-14-v0.3, then merge that intermediate model with Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp using gradient SLERP. The resulting model performs quite well but may benefit from further instruction fine-tuning.
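For context, DARE TIES is the merge method exposed by mergekit as `dare_ties`. The 14-model recipe itself is not reproduced here; the snippet below is only a minimal illustrative sketch of what a DARE TIES configuration looks like, with placeholder model names and parameter values rather than the actual v0.3 recipe.

```yaml
# Illustrative DARE TIES config for mergekit.
# The fine-tune names and all parameter values are placeholders,
# NOT the actual recipe used for Mistral-7B-Merge-14-v0.3.
models:
  - model: mistralai/Mistral-7B-v0.1      # base model, no parameters needed
  - model: example-org/finetune-a-7b      # placeholder fine-tune
    parameters:
      density: 0.5   # fraction of delta weights kept after random dropping (DARE)
      weight: 0.3    # relative contribution in the TIES merge
  - model: example-org/finetune-b-7b      # placeholder fine-tune
    parameters:
      density: 0.5
      weight: 0.3
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16
```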
Open LLM Leaderboard
| Metric | Value |
|---|---|
| Average | 71.19 |
| ARC | 66.81 |
| HellaSwag | 86.15 |
| MMLU | 65.10 |
| TruthfulQA | 58.25 |
| Winogrande | 80.03 |
| GSM8K | 70.81 |
Chat Template
Use either the ChatML or the Llama-2 chat template.
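For reference, the ChatML format wraps each turn in `<|im_start|>` and `<|im_end|>` tokens (the system message below is just an example):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{your prompt}<|im_end|>
<|im_start|>assistant
```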
Merge Configuration
The merge configuration used to produce this model is shown below:
```yaml
slices:
  - sources:
      - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
        layer_range: [0, 32]
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
        layer_range: [0, 32]
merge_method: slerp
base_model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
tokenizer_source: base
embed_slerp: true
dtype: bfloat16
```
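To reproduce the merge, a configuration like this can typically be run with mergekit, e.g. `mergekit-yaml config.yml ./output-model-directory` (optionally with `--cuda` to merge on GPU).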