---
language:
  - en
license: apache-2.0
library_name: transformers
tags:
  - mergekit
  - merge
base_model:
  - arcee-ai/Virtuoso-Small
  - sometimesanotion/Qwen2.5-14B-Qwenvergence-model_stock
metrics:
  - accuracy
pipeline_tag: text-generation
---

![Lamarck](Lamarck.webp)

# Lamarck 14B v0.4 Qwenvergence

Lamarck 14B v0.4 Qwenvergence is a big step up for Lamarck in quality. It uses the same ingredients as previous Lamarck releases, but combines them more effectively. Reasoning is slightly improved over v0.3, while multi-language capability and prose are greatly improved.

## Merge Details

This model was initialized from a model_stock merge and refined from there. No fine-tuning was involved, and no models were used apart from those listed as the contents of Qwen2.5-14B-Qwenvergence-model_stock, except for a very mild application of huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2.
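
The exact recipe for that mild application is not published here. As a minimal sketch only, assuming a low-weight TIES merge over the model_stock base, it might look something like the following; the weight and density values are illustrative, not the settings actually used:

```yaml
# Hypothetical sketch of a "very mild" application; all values are assumptions.
name:                abliterated-touch-up
merge_method:        ties
base_model:          sometimesanotion/Qwen2.5-14B-Qwenvergence-model_stock
tokenizer_source:    base
parameters:
  int8_mask:         true
  normalize:         true
models:
  - model:           sometimesanotion/Qwen2.5-14B-Qwenvergence-model_stock
    parameters:
      weight:        0.95   # assumed: keep most of the base merge intact
      density:       1.00
  - model:           huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
    parameters:
      weight:        0.05   # assumed: only a very mild contribution
      density:       0.30   # assumed: sparse task vector for a light touch
dtype:               bfloat16
```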

### Models Merged

Top influences: these ancestors are in the Qwenvergence model_stock (listed in the configuration below) and are reinforced in later merge steps.

### Prose added

The prose quality has taken a leap, no doubt due in part to the way EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2, sthenno-com/miscii-14b-1028, oxyapi/oxy-1-small, and underwoods/medius-erebus-magnum-14b were applied.

### Configuration

The following YAML configurations were used to initialize and finalize this model:

```yaml
name:                Qwenvergence-model_stock
merge_method:        model_stock
base_model:          Qwen/Qwen2.5-14B
tokenizer_source:    base
parameters:
  int8_mask:         true
  normalize:         true
  rescale:           false
models:
  - model:           allura-org/TQ2.5-14B-Sugarquill-v1
  - model:           oxyapi/oxy-1-small
  - model:           sthenno-com/miscii-14b-1028
  - model:           underwoods/medius-erebus-magnum-14b
  - model:           EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
  - model:           CultriX/SeQwence-14B-EvolMerge
  - model:           arcee-ai/Virtuoso-Small
  - model:           VAGOsolutions/SauerkrautLM-v2-14b-DPO
  - model:           v000000/Qwen2.5-Lumen-14B
dtype:               bfloat16
out_dtype:           bfloat16
---
# Experimental merge methods involving the above models (intermediate steps omitted)
---
name:                Lamarck-14B-v0.4-Qwenvergence
merge_method:        ties
base_model:          sometimesanotion/lamarck-14b-base
tokenizer_source:    base
parameters:         
  density:           1.00
  weight:            1.00
  int8_mask:         true
  normalize:         true
  rescale:           false
models:
  - model:           merges/Qwen2.5-14B-Qwenvergence-slerp
    parameters:
      weight:        1.00
      density:       1.00
  - model:           arcee-ai/Virtuoso-Small
    parameters:
      weight:        1.00
      density:       1.00
```
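
The merges/Qwen2.5-14B-Qwenvergence-slerp checkpoint referenced above comes from one of the omitted experimental steps, and its configuration is not shown here. Purely as an illustrative sketch, and not the actual recipe, a mergekit SLERP between the model_stock base and arcee-ai/Virtuoso-Small could look like the following; the layer range and interpolation factor are assumptions:

```yaml
# Illustrative sketch only; the real merges/Qwen2.5-14B-Qwenvergence-slerp
# configuration is not published here, and every value below is an assumption.
name:                Qwen2.5-14B-Qwenvergence-slerp
merge_method:        slerp
base_model:          sometimesanotion/Qwen2.5-14B-Qwenvergence-model_stock
tokenizer_source:    base
slices:
  - sources:
      - model:       sometimesanotion/Qwen2.5-14B-Qwenvergence-model_stock
        layer_range: [0, 48]   # assumed: Qwen2.5-14B has 48 transformer layers
      - model:       arcee-ai/Virtuoso-Small
        layer_range: [0, 48]
parameters:
  t:                 0.5       # assumed: equal interpolation between the two models
dtype:               bfloat16
```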