Air-Striker-Mixtral-8x7B-Instruct-ZLoss

Experimental model, trained using config and Transformers/Axolotl forks provided by Doctor-Shotgun

Model was fine-tuned from Mixtral-8x7B-v0.1 with airoboros-3.2 dataset, for 4 epochs, ChatML prompt format at 8K context length.

Additionally, model was then merged with Mixtral-8x7B-Instruct-v0.1:


This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the linear merge method.

Models Merged

The following models were included in the merge:

  • mistralai/Mixtral-8x7B-Instruct-v0.1
  • LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
    parameters:
      weight: 0.5
  - model: LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
Downloads last month
10
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model authors have turned it off explicitly.

Dataset used to train LoneStriker/Air-Striker-Mixtral-8x7B-Instruct-ZLoss-3.75bpw-h6-exl2