Use the ChatML or Mistral Nemo prompt format.

Conclusion: these merge methods tend to work better when at least one model has a much higher weight than the rest.

After further testing, this is the best Nemo model I have ever used.

Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: mistral-nemo-gutenberg-12B-v4
    parameters:
      weight: 0.2
  - model: Violet_Twilight-v0.2
    parameters:
      weight: 0.3
  - model: Lyra-Gutenberg-mistral-nemo-12B
    parameters:
      weight: 0.5
  - model: Grey-12b
    parameters:
      weight: 0.2
base_model: Mistral-Nemo-Base-2407
parameters:
  density: 0.5
  epsilon: 0.1
  lambda: 1.1
  normalize: false
  int8_mask: true
  rescale: true
merge_method: della_linear
tokenizer:
  source: union
dtype: bfloat16
```
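The core idea behind a `della_linear` merge can be sketched in a few lines. This is a rough illustration, not mergekit's actual implementation: for each model it forms the delta from the base, stochastically drops a fraction of the delta entries (controlled by `density`), rescales the survivors by `1/density`, then takes the weighted linear combination over the base. DELLA proper samples drops by magnitude rank, and the `epsilon`, `lambda`, and `int8_mask` parameters above are omitted here; the function name and the plain Bernoulli drop are illustrative simplifications.

```python
import numpy as np

def della_linear_sketch(base, finetuned, weights, density=0.5, seed=0):
    """Sketch of a della_linear-style merge for a single tensor.

    base      -- np.ndarray, the base-model tensor
    finetuned -- list of np.ndarray, each the same shape as base
    weights   -- per-model merge weights (used as-is, i.e. normalize: false)
    density   -- fraction of delta entries kept; survivors rescaled by 1/density
    """
    rng = np.random.default_rng(seed)
    merged_delta = np.zeros_like(base, dtype=np.float64)
    for tensor, w in zip(finetuned, weights):
        delta = tensor - base
        # Keep each delta entry with probability `density`
        # (DELLA samples drops by magnitude rank; this is a simplification).
        mask = rng.random(delta.shape) < density
        kept = np.where(mask, delta, 0.0) / density  # rescale survivors
        merged_delta += w * kept
    return base + merged_delta
```

Note that with `normalize: false`, the weights above are applied as given even though they sum to 1.2, which is consistent with the author's observation that skewed weights help.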

Model tree for ohyeah1/Violet-Lyra-Gutenberg
