---
base_model:
- happzy2633/qwen2.5-7b-ins-v3
- bunnycore/Qwen2.5-7B-Matrix
- bunnycore/Qwen2.5-7B-HyperMix
library_name: transformers
tags:
- mergekit
- merge
- reasoning
---
# Qwen2.5-7B-Anvita
Anvita is a reasoning-oriented model whose name comes from a Sanskrit word meaning "connected" or "understood." The name reflects the model's purpose: to connect ideas and understand complex inputs.
Built with the DARE TIES merge method, it combines the pre-trained language models bunnycore/Qwen2.5-7B-Matrix, bunnycore/Qwen2.5-7B-HyperMix, and happzy2633/qwen2.5-7b-ins-v3, and is optimized for reasoning, conversation, and text generation.
The model configuration emphasizes long sequence lengths, conversational data, and dense reasoning ability.
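As a rough illustration of what DARE TIES merging does, here is a toy sketch on flat parameter vectors, not mergekit's actual implementation; the `dare_ties` helper, the weights, and the density value below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

def dare_ties(base, deltas, weights, density):
    """Toy sketch of DARE-TIES merging on flat parameter vectors.

    DARE: randomly Drop a fraction (1 - density) of each task vector
    (the delta from the base model) And REscale survivors by 1/density.
    TIES: elect a majority sign per parameter and keep only the
    contributions that agree with it before summing into the base.
    """
    sparsified = []
    for delta in deltas:
        mask = rng.random(delta.shape) < density   # keep ~density of entries
        sparsified.append(np.where(mask, delta / density, 0.0))
    stacked = np.stack([w * d for w, d in zip(weights, sparsified)])
    sign = np.sign(stacked.sum(axis=0))            # elected sign per parameter
    agree = np.where(np.sign(stacked) == sign, stacked, 0.0)
    return base + agree.sum(axis=0)

base = rng.normal(size=8)
deltas = [rng.normal(size=8), rng.normal(size=8)]
merged = dare_ties(base, deltas, weights=[0.35, 0.45], density=0.5)
print(merged.shape)  # (8,)
```

In the real merge, the per-layer weight and density lists in the configuration below interpolate these values across the depth of the network.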
## Note

For the best reasoning performance from this model, use BF16 precision and XTC sampling.
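XTC ("Exclude Top Choices") sampling removes the most probable tokens whenever several of them clear a probability threshold, keeping only the least likely of those, which steers generation away from the most predictable continuations. A minimal sketch, assuming the common `threshold`/`xtc_probability` parameterization used by llama.cpp and text-generation-webui, not an API of this model:

```python
import numpy as np

def xtc_filter(probs, threshold=0.1, xtc_probability=0.5, rng=None):
    """Toy sketch of XTC sampling over a token probability vector.

    With probability `xtc_probability`, every token whose probability
    meets `threshold` is removed EXCEPT the least likely of them.
    """
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=float)
    above = np.flatnonzero(probs >= threshold)
    # Need at least two qualifying tokens, and the random gate must fire.
    if len(above) < 2 or rng.random() >= xtc_probability:
        return probs / probs.sum()
    keep = above[np.argsort(probs[above])][0]  # least likely qualifying token
    filtered = probs.copy()
    filtered[above] = 0.0
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = [0.5, 0.3, 0.15, 0.05]
out = xtc_filter(probs, threshold=0.2, xtc_probability=1.0,
                 rng=np.random.default_rng(0))
print(out)  # 0.5 is excluded; 0.3 (least likely above threshold) survives
```

Samplers that expose XTC typically apply this filter after the usual softmax and before top-p/top-k truncation.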
## Configuration

The following YAML configuration was used to produce this model:
```yaml
models:
- model: bunnycore/Qwen2.5-7B-Matrix
  parameters:
    weight: [0.25, 0.35, 0.45, 0.35, 0.25]
    density: [0.1, 0.25, 0.5, 0.25, 0.1]
- model: bunnycore/Qwen2.5-7B-HyperMix
- model: happzy2633/qwen2.5-7b-ins-v3
  parameters:
    weight: [0.55, 0.45, 0.35, 0.45, 0.55]
    density: [0.1, 0.25, 0.5, 0.25, 0.1]
merge_method: dare_ties
base_model: bunnycore/Qwen2.5-7B-HyperMix
parameters:
  int8_mask: true
dtype: bfloat16
```