Model Description

This is an experiment to test merging 14 models using DARE TIES 🦙

We first merged 14 models with DARE TIES to produce EmbeddedLLM/Mistral-7B-Merge-14-v0.3, which was then merged with Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp using Gradient SLERP. The result performs quite well on benchmarks but may require further instruction fine-tuning.
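
For context, a DARE TIES merge in mergekit is specified with a config of the following shape. This is only an illustrative sketch: the model names, densities, and weights below are placeholders, not the actual recipe behind EmbeddedLLM/Mistral-7B-Merge-14-v0.3.

```yaml
# Illustrative dare_ties config; the listed models and parameter values
# are placeholders, not the actual 14-model v0.3 recipe.
models:
  - model: mistralai/Mistral-7B-v0.1          # base model, no parameters
  - model: teknium/OpenHermes-2.5-Mistral-7B  # placeholder donor model
    parameters:
      density: 0.5   # fraction of delta weights kept after random dropping
      weight: 0.3    # scale applied to this model's contribution
  - model: Intel/neural-chat-7b-v3-3          # placeholder donor model
    parameters:
      density: 0.5
      weight: 0.3
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16
```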

Open LLM Leaderboard

| Metric     | Value |
|------------|------:|
| Average    | 71.19 |
| ARC        | 66.81 |
| HellaSwag  | 86.15 |
| MMLU       | 65.10 |
| TruthfulQA | 58.25 |
| Winogrande | 80.03 |
| GSM8K      | 70.81 |

Chat Template

Either the ChatML or the Llama-2 chat template can be used.
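
For reference, ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` markers:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```

while the Llama-2 template looks like:

```
<s>[INST] <<SYS>>
You are a helpful assistant.
<</SYS>>

Hello! [/INST]
```

If in doubt, check the chat template bundled with the tokenizer (e.g. via `tokenizer.apply_chat_template`) to confirm which format the uploaded tokenizer config actually defines.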

Merge Configuration

The merge config file for this model is shown below:

```yaml
slices:
  - sources:
      - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
        layer_range: [0, 32]
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
        layer_range: [0, 32]

merge_method: slerp
base_model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp

parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
tokenizer_source: base
embed_slerp: true

dtype: bfloat16
```
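
Assuming the config above is saved as `config.yml`, the merge can be reproduced with the mergekit CLI (flag names may vary across mergekit versions):

```sh
pip install mergekit
mergekit-yaml config.yml ./Mistral-7B-Merge-14-v0.4 --cuda
```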