---
base_model:
- happzy2633/qwen2.5-7b-ins-v3
- bunnycore/Qwen2.5-7B-Matrix
- bunnycore/Qwen2.5-7B-HyperMix
library_name: transformers
tags:
- mergekit
- merge
- reasoning
---
## Qwen 2.5-7B-Anvita
Anvita is a reasoning-oriented model whose name comes from a Sanskrit word meaning "connected" or "understood." The name reflects the model's purpose: connecting ideas and understanding complex inputs.

Built with the DARE TIES merge method, it combines three pre-trained language models: bunnycore/Qwen2.5-7B-HyperMix (the base), bunnycore/Qwen2.5-7B-Matrix, and happzy2633/qwen2.5-7b-ins-v3. The merge is optimized for reasoning, conversation, and text generation, with a configuration that emphasizes long sequence lengths, conversation datasets, and dense reasoning ability.
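As a rough intuition for the merge method: DARE randomly drops a fraction of each fine-tune's "task vector" (its difference from the base weights) and rescales the survivors, while TIES resolves sign conflicts between contributors before adding the result back to the base. A toy single-tensor sketch, not mergekit's actual implementation:

```python
import numpy as np

def dare_sparsify(task_vector, density, rng):
    """DARE step: randomly drop ~(1 - density) of the task vector and
    rescale the survivors by 1/density so the expected sum is preserved."""
    mask = rng.random(task_vector.shape) < density
    return np.where(mask, task_vector / density, 0.0)

def dare_ties_merge(base, finetuned, weights, densities, seed=0):
    """Toy single-tensor DARE TIES sketch (illustrative only).

    Each fine-tune contributes a sparsified, weighted task vector
    (model - base); a TIES-style sign election then keeps only the
    entries that agree with the sign of the weighted sum.
    """
    rng = np.random.default_rng(seed)
    sparse = [w * dare_sparsify(m - base, d, rng)
              for m, w, d in zip(finetuned, weights, densities)]
    total = np.sum(sparse, axis=0)
    elected = np.sign(total)  # majority sign per parameter
    resolved = sum(np.where(np.sign(s) == elected, s, 0.0) for s in sparse)
    return base + resolved
```

In the real merge, the `weight` and `density` lists in the configuration below are interpolated across layer groups, so each contributor's influence varies with depth.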
### Note
For the best reasoning quality, run the model in BF16 precision and use XTC sampling.
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: bunnycore/Qwen2.5-7B-Matrix
    parameters:
      weight: [0.25, 0.35, 0.45, 0.35, 0.25]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
  - model: bunnycore/Qwen2.5-7B-HyperMix
  - model: happzy2633/qwen2.5-7b-ins-v3
    parameters:
      weight: [0.55, 0.45, 0.35, 0.45, 0.55]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
merge_method: dare_ties
base_model: bunnycore/Qwen2.5-7B-HyperMix
parameters:
  int8_mask: true
dtype: bfloat16
```
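Assuming mergekit is installed, a merge like this can be reproduced by saving the YAML above to a file and passing it to mergekit's `mergekit-yaml` entry point (file and output paths are illustrative):

```shell
pip install mergekit
mergekit-yaml anvita-config.yaml ./Qwen2.5-7B-Anvita --cuda
```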