Air-Striker-Mixtral-8x7B-Instruct-ZLoss
Experimental model, trained using config and Transformers/Axolotl forks provided by Doctor-Shotgun
Model was fine-tuned from Mixtral-8x7B-v0.1 with airoboros-3.2 dataset, for 4 epochs, ChatML prompt format at 8K context length.
Additionally, model was then merged with Mixtral-8x7B-Instruct-v0.1:
This is a merge of pre-trained language models created using mergekit.
Merge Details
Merge Method
This model was merged using the linear merge method.
Models Merged
The following models were included in the merge:
- mistralai/Mixtral-8x7B-Instruct-v0.1
- LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
Configuration
The following YAML configuration was used to produce this model:
models:
- model: mistralai/Mixtral-8x7B-Instruct-v0.1
parameters:
weight: 0.5
- model: LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
parameters:
weight: 0.5
merge_method: linear
dtype: bfloat16
- Downloads last month
- 5