---

base_model:
  - happzy2633/qwen2.5-7b-ins-v3
  - bunnycore/Qwen2.5-7B-Matrix
  - bunnycore/Qwen2.5-7B-HyperMix
library_name: transformers
tags:
  - mergekit
  - merge
  - reasoning
  - qwen
license: apache-2.0
language:
  - en

---

# **Qwen 2.5-7B Anvita**

![img](./logo.webp)

## Overview

**Anvita** is a reasoning-oriented language model designed to **connect ideas** and **understand complex inputs**. The name comes from a Sanskrit word meaning "connected" or "understood," and the model aims to embody that intellectual depth and comprehension, making it a strong choice for tasks requiring nuanced understanding and sophisticated reasoning.

Built using the **DARE TIES** merge method, Anvita combines three pre-trained language models:

- **bunnycore/Qwen2.5-7B-HyperMix**
- **bunnycore/Qwen2.5-7B-Matrix**
- **happzy2633/qwen2.5-7b-ins-v3**

This combination optimizes Anvita for superior reasoning, dynamic conversations, and high-quality text generation.
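For intuition about the DARE step, here is a toy sketch (illustrative only, not the actual merge tooling): each fine-tuned model's delta from the base is mostly dropped at random, and the surviving deltas are rescaled so their expected value is preserved; TIES then resolves sign conflicts across models before summing the deltas back onto the base.

```python
import torch

def dare_sketch(base: torch.Tensor, finetuned: torch.Tensor, drop_p: float = 0.9) -> torch.Tensor:
    """Toy DARE step: drop each delta weight with probability drop_p and
    rescale the survivors by 1/(1 - drop_p), keeping the expected delta."""
    delta = finetuned - base
    keep = (torch.rand_like(delta) > drop_p).float()
    return base + keep * delta / (1.0 - drop_p)

# Tiny demo on a six-parameter "model"
base = torch.zeros(6)
finetuned = torch.tensor([0.2, -0.1, 0.3, 0.0, -0.4, 0.1])
print(dare_sketch(base, finetuned))
```

In the configuration shown later in this card, the `density` values play the role of the keep fraction (i.e., drop probability 1 − density) on a per-layer schedule.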

## Features

- **Enhanced Reasoning:** Optimized for multi-step reasoning across various domains.
- **Long Sequence Handling:** Capable of processing extended inputs without loss of context.
- **Conversational Fluency:** Engages in fluid, context-aware dialogues.
- **Dense Knowledge Integration:** Combines knowledge from multiple base models for comprehensive understanding.

## Installation

To get started with Anvita, ensure you have the necessary dependencies installed. You can use the [Transformers](https://huggingface.co/docs/transformers/index) library for seamless integration.

```bash
pip install torch transformers rich
```

## Quick Start

Here's a simple example to demonstrate how to use Anvita for generating responses with enhanced reasoning capabilities.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from rich.console import Console
from rich.markdown import Markdown

# cot_decode_speculative is provided by entropic_cot.py in the model repository
from entropic_cot import cot_decode_speculative

# Initialize console
console = Console()

# Load the tokenizer and model from the specified path
MODEL_PATH = "sethuiyer/Qwen2.5-7B-Anvita"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH).to("cuda")

QUESTION = "Is 9.11 greater than 9.8?"

messages = [
    {"role": "user", "content": QUESTION}
]

# Generate the answer using Entropic Chain of Thought decoding
answer, score = cot_decode_speculative(model, tokenizer, messages, k=2, max_new_tokens=2058)

# Format the answer as markdown
markdown_answer = f"""
# **Answer:**  
{answer}

**Score:** {score}
"""

# Display the answer in markdown format
console.print(Markdown(markdown_answer))
```

**Example Output with k=2:**

```text
No, 9.11 is not greater than 9.8. To compare these two numbers, we can look at their decimal places. The number 9.8
can be thought of as 9.80, which makes it easier to compare directly with 9.11. Since 80 is greater than 11, it's  
clear that 9.8 is greater than 9.11.
```

**Step-by-Step Reasoning with k=2** (for the "Kingdom"/"Kith" question used under Advanced Usage below):

```text
Certainly! Let's break down the process step by step to determine how many 'K's are in the words "Kingdom" and     
"Kith."                                                                                                            

Step 1: Identify the word "Kingdom"                                                                                

 • The word "Kingdom" has the following letters: K, I, N, G, D, O, M.                                              
 • Count the number of 'K's in this word: There is only one 'K'.                                                    

Step 2: Identify the word "Kith"                                                                                    

 • The word "Kith" has the following letters: K, I, T, H.                                                          
 • Count the number of 'K's in this word: There is only one 'K'.                                                   

Step 3: Summarize the results                                                                                      

 • In "Kingdom," there is 1 'K'.                                                                                   
 • In "Kith," there is 1 'K'.                                                                                      

Final Answer:                                                                                                      

 • There is a total of 2 'K's in both words combined: 1 'K' in "Kingdom" and 1 'K' in "Kith."                      

So, the total number of 'K's in the words "Kingdom" and "Kith" is 2.
```

## Advanced Usage

For optimal reasoning performance, it is recommended to use **BF16** precision and the [Entropic Chain of Thought](https://huggingface.co/sethuiyer/Qwen2.5-7B-Anvita/blob/main/entropic_cot.py) decoding method. This experimental decoder combines entropy and CoT decoding to enhance output quality.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from rich.console import Console
from rich.markdown import Markdown

# cot_decode_speculative is provided by entropic_cot.py in the model repository
from entropic_cot import cot_decode_speculative

console = Console()
MODEL_PATH = "sethuiyer/Qwen2.5-7B-Anvita"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
# Load in BF16, as recommended above for best reasoning performance
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH, torch_dtype=torch.bfloat16
).to("cuda")

QUESTION = "How many 'K's are there in the words 'Kingdom' and 'Kith'?"
messages = [
    {"role": "user", "content": QUESTION}
]

# Generate the answer with Entropic Chain of Thought decoding
answer, score = cot_decode_speculative(model, tokenizer, messages, k=2, max_new_tokens=2058)

# Display the formatted answer
markdown_answer = f"""
# **Answer:**  
{answer}

**Score:** {score}
"""

console.print(Markdown(markdown_answer))
```
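For intuition, here is a minimal, hypothetical sketch of the CoT-decoding idea this method builds on: branch on the top-k candidates for the first token, greedily decode each branch, and score each continuation by the model's average top-1/top-2 probability gap. Everything below (including the name `cot_decode_sketch`) is illustrative; the actual `cot_decode_speculative` in `entropic_cot.py` additionally folds in an entropy term.

```python
import torch
import torch.nn.functional as F

def cot_decode_sketch(model, tokenizer, messages, k=2, max_new_tokens=256):
    """Illustrative CoT decoding: branch on the top-k first tokens,
    greedily decode each branch, and keep the continuation whose
    tokens the model was most confident about on average."""
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Top-k candidates for the first generated token define the branches
    with torch.no_grad():
        first_logits = model(prompt).logits[0, -1]
    branches = torch.topk(first_logits, k).indices

    best_answer, best_score = "", float("-inf")
    for token_id in branches:
        ids = torch.cat([prompt, token_id.view(1, 1)], dim=-1)
        out = model.generate(
            ids,
            max_new_tokens=max_new_tokens,
            do_sample=False,
            output_scores=True,
            return_dict_in_generate=True,
            pad_token_id=tokenizer.eos_token_id,
        )
        # Confidence = mean gap between the top-1 and top-2 token
        # probabilities at each step; larger gaps mean a surer path.
        gaps = []
        for step_logits in out.scores:
            top2 = torch.topk(F.softmax(step_logits[0], dim=-1), 2).values
            gaps.append((top2[0] - top2[1]).item())
        score = sum(gaps) / max(len(gaps), 1)
        if score > best_score:
            text = tokenizer.decode(
                out.sequences[0, prompt.shape[1]:], skip_special_tokens=True
            )
            best_answer, best_score = text, score

    return best_answer, best_score
```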

## Configuration

The following YAML configuration was used to produce Anvita:

```yaml
models:
  - model: bunnycore/Qwen2.5-7B-Matrix
    parameters:
      weight: [0.25, 0.35, 0.45, 0.35, 0.25]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
  - model: bunnycore/Qwen2.5-7B-HyperMix
  - model: happzy2633/qwen2.5-7b-ins-v3
    parameters:
      weight: [0.55, 0.45, 0.35, 0.45, 0.55]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
merge_method: dare_ties
base_model: bunnycore/Qwen2.5-7B-HyperMix
parameters:
  int8_mask: true
dtype: bfloat16
```
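To reproduce a merge like this, save the configuration to a file (the filename below is illustrative) and pass it to mergekit's `mergekit-yaml` command:

```bash
pip install mergekit
mergekit-yaml anvita.yaml ./Qwen2.5-7B-Anvita --cuda
```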

## Testimonial

### **Written by GPT-4o**

---

**Anvita** offers a unique blend of **logical rigor** and **creative flair**. The model is **versatile**, tackling a broad spectrum of challenges across **mathematics, law, science, programming, and storytelling**, and it excels in creative writing and logical problem-solving, consistently producing **engaging narratives and structured reasoning chains**.

However, certain areas, such as **symbolic puzzles, detective mysteries, and edge-case handling**, still present opportunities for **further improvement**. Through **targeted training and refinement**, Anvita could **unlock even greater potential** and become a **dominant force among natural language reasoning models**.

---

## Performance Evaluation

### **Key Strengths**

1. **Creative Writing**
   - Generates **rich, immersive narratives** across multiple genres, especially excelling in **science fiction, dark fantasy, and character-driven stories**.
   - Its ability to **develop coherent plots and engaging dialogue** ensures that creative outputs meet a high standard.

2. **Logical Reasoning and Problem Solving**
   - Demonstrates strong **multi-step reasoning** across mathematical, legal, and scientific problems.
   - Handles **complex logical structures** effectively, such as **graph theory, probability, and legal scenarios**.

3. **Conversational Fluency**
   - Engages in **context-aware, fluid conversations** that mimic human interaction.
   - Offers insightful takes on abstract topics, such as **existential questions** and **philosophy**.

4. **Programmatic Competency**
   - Proficient in generating functional code, especially in **C++ and HolyC**, though minor adjustments are occasionally required.
   - Tackles **algorithmic challenges** with competence, contributing solutions across **mathematics and programming logic**.

### **Areas for Improvement**

1. **Symbolic Reasoning and Puzzles**
   - Struggles with **abstract symbolic puzzles**, requiring deeper understanding to identify patterns and relationships.
   - Needs refinement in tackling **advanced combinatorics** and interpreting **subtle patterns**.

2. **Detective Mysteries**
   - Competent at generating mystery scenarios, but falls short in **crafting surprising twists** and in the complex deductions associated with **locked-room scenarios**.
   - Additional exposure to **Detective Conan-style reasoning frameworks** would significantly enhance performance.

3. **Handling Edge Cases**
   - Occasionally misses **nuanced edge cases** in graph theory and statistical problems.
   - Would benefit from more **granular handling** of boundary conditions and **edge-specific logic**.

---

## Overall Performance Summary

- **Overall Score:** 73/100
- **Tested Domains:** Creative Writing, Logical Reasoning, Symbolic Reasoning, Programming, Mathematics, Law, Scientific Problem-Solving.