# merge

This is a merge of pre-trained language models created using mergekit.

## Merge Details

### Merge Method

This model was merged using the Passthrough merge method.
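Passthrough merging does not average weights; it concatenates layer slices from the source model(s) into a single deeper stack, so the merged model's depth is simply the sum of the slice lengths. As a rough illustration (assuming mergekit's half-open `[start, end)` convention for `layer_range`), the slice ranges used in the configuration below imply the following depth:

```python
# Slices taken from the YAML configuration in this card, interpreted as
# half-open [start, end) ranges of decoder layers.
slices = [(0, 8), (9, 19), (10, 19), (20, 28)]

# Passthrough stacks the slices end-to-end, so depth is additive.
merged_depth = sum(end - start for start, end in slices)
print(merged_depth)  # 35 layers, versus 28 in the source model's stack
```

The extra duplicated layers are why the merged model is larger than the 7B source it was built from.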

### Models Merged

The following models were included in the merge:

* huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2

### Configuration

The following YAML configuration was used to produce this model:

```yaml
name: Q2.5-DeepSeek-R1-DeepThink-test1
const_tag: &scale_factor 0.7071067812  # 1/sqrt(2) scaling for stability

attenuate-env: &attenuated_env
  parameters:
    scale:
      - filter: q_proj
        value: *scale_factor
      - filter: k_proj
        value: *scale_factor
      - value: 1.0

slices:
  - sources:
      - model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
        layer_range: [0, 8]  # Retaining foundational knowledge and language structure.

  - sources:
      - model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
        layer_range: [9, 19]  # Full-strength duplication of mid-range reasoning layers.

  - sources:
      - model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
        layer_range: [10, 19]  # Targeted reinforcement, slightly attenuated to avoid over-dominance.
        <<: *attenuated_env

  - sources:
      - model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
        layer_range: [20, 28]  # Keeping higher-level abstract processing untouched for stability.

merge_method: passthrough
dtype: bfloat16
normalize: true
int8_mask: true
```
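The `scale_factor` anchor attenuates the duplicated third slice: because an attention logit is a dot product of query and key projections, scaling both `q_proj` and `k_proj` weights by 1/√2 halves the pre-softmax logits of those layers. A quick sanity check of the constant and its effect (illustrative scalar stand-ins, not the actual tensors):

```python
import math

scale = 0.7071067812  # const_tag value from the config above
assert abs(scale - 1 / math.sqrt(2)) < 1e-9  # the constant is 1/sqrt(2)

# Scaling both projections by 1/sqrt(2) scales their dot product
# (the attention logit) by exactly 1/2, softening the duplicated
# layers' contribution without zeroing them out.
q, k = 3.0, 4.0                 # stand-in scalar "projections"
logit = q * k
attenuated = (scale * q) * (scale * k)
print(attenuated / logit)       # ratio is 1/2
```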