merge
This is a merge of pre-trained language models created using mergekit.
Merge Details
Merge Method
This model was merged using the della_linear merge method, with CultriX/Qwen2.5-14B-Wernickev3 as the base model.
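For readers unfamiliar with the method, the following is a simplified sketch of the idea behind della_linear on plain Python lists. It is not mergekit's implementation. It assumes that each model's difference from the base (its "delta") is kept with a probability that rises with its magnitude rank, within density ± epsilon, that kept deltas are rescaled by the inverse of that probability, and that the results are combined by a normalized weighted sum and scaled by lambda. The function names and the toy inputs are hypothetical.

```python
import random


def keep_probabilities(deltas, density, epsilon):
    """Give each delta a keep probability in [density - epsilon, density + epsilon].

    Larger-magnitude deltas get a higher keep probability (assumed ranking rule).
    """
    n = len(deltas)
    if n == 1:
        return [density]
    # order[r] is the index of the delta with the r-th smallest magnitude
    order = sorted(range(n), key=lambda i: abs(deltas[i]))
    probs = [0.0] * n
    for rank, idx in enumerate(order):
        probs[idx] = density - epsilon + 2 * epsilon * rank / (n - 1)
    return probs


def drop_and_rescale(deltas, density, epsilon, rng):
    """Randomly drop deltas, rescaling the survivors by 1 / keep probability."""
    probs = keep_probabilities(deltas, density, epsilon)
    return [d / p if rng.random() < p else 0.0 for d, p in zip(deltas, probs)]


def della_linear_sketch(base, tuned, weights, densities, epsilon, lam, seed=0):
    """Merge toy parameter vectors: base + lam * normalized weighted sum of sparse deltas."""
    rng = random.Random(seed)
    total = sum(weights)  # corresponds to normalize: true
    merged = [0.0] * len(base)
    for model, weight, density in zip(tuned, weights, densities):
        deltas = [t - b for t, b in zip(model, base)]
        sparse = drop_and_rescale(deltas, density, epsilon, rng)
        for i, value in enumerate(sparse):
            merged[i] += (weight / total) * value
    return [b + lam * m for b, m in zip(base, merged)]


if __name__ == "__main__":
    # Toy example using the base model's density and the global epsilon and lambda.
    base = [0.0, 1.0, -1.0]
    tuned = [[0.2, 1.1, -1.0], [0.1, 0.7, -0.5]]
    print(della_linear_sketch(base, tuned, [0.26, 0.23], [0.7, 0.65], 0.012, 1.4))
```

Because dropping is random, the result depends on the seed. With a density of 1 and an epsilon of 0 nothing is dropped, and the sketch reduces to a plain weighted average of deltas added to the base.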
Models Merged
The following models were included in the merge:
- allknowingroger/QwenSlerp6-14B
- qingy2019/Qwen2.5-Math-14B-Instruct
- CultriX/Qwen2.5-14B-Broca
- sometimesanotion/Qwen2.5-14B-Vimarckoso
- djuna/Q2.5-Veltha-14B-0.5
- CultriX/Qwenfinity-2.5-14B
- CultriX/SeQwence-14Bv1
Configuration
The following YAML configuration was used to produce this model:
# Refined configuration using della_linear for optimized logical and multitask reasoning.
base_model: CultriX/Qwen2.5-14B-Wernickev3
merge_method: della_linear
parameters:
  epsilon: 0.012 # Ultra-fine parameter scaling for precision.
  lambda: 1.4 # Emphasis on significant model contributions.
  normalize: true # Ensures balanced parameter integration.
  smoothing_factor: 0.1 # Precisely tuned for smooth task-specific blending.
gradient_clipping:
  CultriX/Qwen2.5-14B-Wernickev3: 0.86 # Stabilized multitask performance.
  CultriX/Qwenfinity-2.5-14B: 0.83 # Multitask integration refinement.
  djuna/Q2.5-Veltha-14B-0.5: 0.91 # Strengthened advanced reasoning.
  CultriX/Qwen2.5-14B-Broca: 0.85 # Logical and contextual reasoning stability.
  qingy2019/Qwen2.5-Math-14B-Instruct: 0.93 # Mathematical reasoning prioritization.
  CultriX/SeQwence-14Bv1: 0.88 # Generalist multitask contributions.
  sometimesanotion/Qwen2.5-14B-Vimarckoso: 0.89 # Multi-step reasoning enhancement.
  allknowingroger/QwenSlerp6-14B: 0.87 # Contextual reasoning integration.
models:
  - model: CultriX/Qwen2.5-14B-Wernickev3
    parameters:
      weight: 0.26 # Core backbone for multitask reasoning.
      density: 0.7 # Prioritizes critical reasoning parameters.
  - model: CultriX/Qwenfinity-2.5-14B
    parameters:
      weight: 0.23 # Comprehensive multitask contributor.
      density: 0.65
  - model: djuna/Q2.5-Veltha-14B-0.5
    parameters:
      weight: 0.22 # Advanced reasoning support for GPQA and MUSR.
      density: 0.72
  - model: CultriX/Qwen2.5-14B-Broca
    parameters:
      weight: 0.15 # Logical reasoning and factual QA enhancements.
      density: 0.65
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters:
      weight: 0.18 # Mathematical reasoning priority.
      density: 0.73
  - model: CultriX/SeQwence-14Bv1
    parameters:
      weight: 0.14 # Generalist multitask backbone.
      density: 0.63
  - model: sometimesanotion/Qwen2.5-14B-Vimarckoso
    parameters:
      weight: 0.12 # Multi-step reasoning tasks contributor.
      density: 0.6
  - model: allknowingroger/QwenSlerp6-14B
    parameters:
      weight: 0.1 # Contextual reasoning improvements.
      density: 0.62
adaptive_merge_parameters:
  task_weights:
    tinyArc: 1.6 # Logical reasoning improvements.
    tinyHellaswag: 1.5 # Contextual consistency.
    tinyMMLU: 1.65 # Domain knowledge enhancement.
    tinyTruthfulQA: 1.9 # Accurate factual reasoning.
    tinyTruthfulQA_mc1: 1.7 # Multiple-choice reasoning focus.
    tinyWinogrande: 1.75 # Advanced reasoning boost.
    IFEval: 1.9 # Instruction-following tasks prioritized.
    BBH: 1.7 # Complex reasoning support.
    MATH: 2.1 # Mathematical excellence emphasized.
    GPQA: 1.8 # Graduate-level QA enhanced.
    MUSR: 1.9 # Multi-step reasoning strengthened.
    MMLU-PRO: 1.8 # Domain multitask performance maximized.
tokenizer_source: CultriX/Qwen2.5-14B-Wernickev3 # Tokenizer aligned with backbone.
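The per-model weights in the configuration sum to 1.40 rather than 1. Assuming that normalize: true divides each weight by the sum of all weights (my reading of the option, not something this card states), the short sketch below computes each model's effective share of the merged delta. The dictionary simply restates the weight values from the configuration.

```python
# Per-model weights copied from the configuration above.
weights = {
    "CultriX/Qwen2.5-14B-Wernickev3": 0.26,
    "CultriX/Qwenfinity-2.5-14B": 0.23,
    "djuna/Q2.5-Veltha-14B-0.5": 0.22,
    "CultriX/Qwen2.5-14B-Broca": 0.15,
    "qingy2019/Qwen2.5-Math-14B-Instruct": 0.18,
    "CultriX/SeQwence-14Bv1": 0.14,
    "sometimesanotion/Qwen2.5-14B-Vimarckoso": 0.12,
    "allknowingroger/QwenSlerp6-14B": 0.1,
}

# Assumed effect of normalize: true, which divides every weight by the total.
total = sum(weights.values())
normalized = {name: w / total for name, w in weights.items()}

if __name__ == "__main__":
    print(f"sum of raw weights: {total:.2f}")
    for name, share in sorted(normalized.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {share:.3f}")
```

Under that assumption the base model's raw weight of 0.26 corresponds to a share of roughly 0.186. Since the base model's delta from itself is zero, that share would contribute nothing to the merged delta.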