---
base_model: unsloth/qwen2.5-7b-instruct-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
license: apache-2.0
language:
- en
datasets:
- kenhktsui/longtalk-cot-v0.1
---
# kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged

After 1 epoch of SFT, the model scores 26% higher (relative) on the ProcessBench GSM-8k subset than the reported base-model score. However, it scores lower on GSM-8k itself.
This could mean either 1) the metrics are not measured correctly because of answer parsing, or 2) the model is overfitting to answers, as it is not able to critique reasoning steps.
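
To illustrate the parsing hypothesis, here is a minimal sketch of how two answer extractors can disagree on the same response. It is not the scorer used for the numbers in this card; `last_number` and `boxed_answer` are hypothetical helpers, and the sample response is made up to resemble the boxed final answers shown in the quick tests below.

```python
import re
from typing import Optional

# Hypothetical helpers for illustration only; not the evaluation code behind this card.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def last_number(text: str) -> Optional[str]:
    """Naive extraction: return the last number anywhere in the response."""
    # Commas are dropped so that "1,000" is read as "1000".
    matches = NUMBER.findall(text.replace(",", ""))
    return matches[-1] if matches else None


def boxed_answer(text: str) -> Optional[str]:
    """Prefer the last \\boxed{...} span; fall back to last_number."""
    boxed = BOXED.findall(text)
    if boxed:
        return boxed[-1].strip()
    return last_number(text)


# Made-up response in the long, self-checking style of the fine-tuned model.
response = (
    "So the total is 18.\n\n**Final Answer**\n\n"
    "\\[ \\boxed{18} \\]\n\nChecked in 2 ways."
)

print(last_number(response))   # naive parser picks up the trailing "2"
print(boxed_answer(response))  # boxed-aware parser returns "18"
```

If the GSM-8k scorer behaves like the naive extractor, longer responses with text after the final answer could be marked wrong even when the boxed answer is correct.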

||[kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged](https://huggingface.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged)|[Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)|
|---|---|---|
|ProcessBench (GSM-8k)|**46.0**|36.5 (reported)|
|GSM-8k|78.54|**84.23**|
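
The 26% figure is a relative gain, not a gain in percentage points. A short sketch of the arithmetic, using only the scores from the table above (the helper name `relative_change` is my own):

```python
def relative_change(new: float, old: float) -> float:
    """Relative change of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Scores copied from the table above (fine-tuned model vs. base model).
processbench = relative_change(46.0, 36.5)
gsm8k = relative_change(78.54, 84.23)

print(f"ProcessBench (GSM-8k): {processbench:+.1f}% relative, {46.0 - 36.5:+.1f} points")
print(f"GSM-8k: {gsm8k:+.1f}% relative, {78.54 - 84.23:+.2f} points")
```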

This needs to be investigated further.
I inspected how **model behavior changed** after SFT. As expected, the fine-tuned model thinks much longer and often solves a problem with multiple approaches instead of one.

### Quick Test 1: Is 9.11 greater than 9.9?
<details> 
  <summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary>
No, 9.11 is not greater than 9.9. Let's think about this step by step.

First, it helps to understand that both numbers are decimal numbers, specifically they have two digits after the decimal point. So, 9.11 and 9.9 can be compared directly based on their
values in each place value: whole number part and the tenths and hundredths parts.

Let's break them down:

- For 9.11:
  - The whole number part is 9.
  - The tenths place is 1 (which means one tenth).
  - The hundredths place is also 1 (one hundredth).

- For 9.9:
  - The whole number part is also 9.
  - But here, the tenths place is 9 (nine tenths), and there's no hundredths digit after that; it's considered to have zero in the hundredths place because decimal numbers with fewer digits
are filled with zeros.

So, comparing them:

- Both have the same whole number part: 9.
- For the tenth place:
  - 1 vs. 9.

Since 9 is greater than 1, this already tells us that 9.9 is larger than 9.11. But to be thorough, let's look at the hundredths places as well:

- In 9.11: hundredths = 1
- In 9.9: hundredths = 0

Even though there are no digits in the hundredths place for 9.9, it's still considered zero because of the way decimals work.

Therefore, overall, 9.11 < 9.9.

Wait a minute, but I should make sure that I'm not missing anything here. Sometimes, people might think about adding or subtracting numbers differently, but in this case, since both numbers
have the same whole number part and one has a larger tenths digit, it's clear which is larger.

Another way to look at it is to consider them as percentages:

- 9.11 = 911%
- 9.9 = 990%

But that might be confusing; in percentage terms, 990% is still greater than 911%.

Alternatively, if we multiply both numbers by 100 to eliminate the decimal point:

- 9.11 * 100 = 911
- 9.9 * 100 = 990

Again, 990 is greater than 911.

So, no matter how we look at it, 9.9 is larger than 9.11.

I think that covers the comparison thoroughly. There doesn't seem to be any trick or special case here; it's a straightforward decimal comparison once you understand place values.

**Final Answer**

\[ \boxed{9.11 < 9.9} \]
</details>

<details>
  <summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
No, 9.11 is not greater than 9.9. When comparing decimal numbers, you can compare them digit by digit from left to right. In this case:

- The whole number part of both numbers is the same (9).
- Comparing the tenths place, we see that 1 in 9.11 is larger than 9 in 9.9.
- Therefore, 9.11 > 9.9.

So, 9.11 is greater than 9.9.
</details>


### Quick Test 2: How many letters R in the word strawberry?
<details>
<summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf  </summary>
Let's see. The word is "strawberry." I need to count how many 'R' letters are there in this word.

First, let me write down the word again: s-t-r-a-w-b-e-r-r-y.

Now, I'll go through each letter and check if it's an 'R'.

Starting with the first letter: S - that's not R.

Second letter: T - also not R.

Third letter: R - yes, that's an 'R'. So, one so far.

Fourth letter: A - nope.

Fifth letter: W - still not R.

Sixth letter: B - not R either.

Seventh letter: E - no.

Eighth letter: R - another 'R'! That makes two now.

Ninth letter: B - no R here.

Tenth letter: E - no R.

Eleventh letter: R - yet again, an 'R'. Now it's three.

Twelfth letter: Y - not R.

So, counting all the way through, I've found three 'R' letters in "strawberry."

Wait a minute, let me double-check. Maybe there are more 'R's that I missed.

Looking at "s-t-r-a-w-b-e-r-r-y." Yes, I did find three: 3rd, 8th, and 11th positions.

I think that's it. There don't seem to be any other 'R's in the word.

Therefore, there are 3 letters R in the word "strawberry."

**Final Answer**

\[ \boxed{3} \]
</details>
<details>
<summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
The letter "R" appears 1 time in the word "strawberry".
</details>
  
### Try out in Ollama!
```
ollama run hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf
```
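
Once the model has been pulled with the command above, it can also be queried from a script. The following is a minimal sketch that assumes an Ollama server is running locally on its default port (11434) and that the model is registered under the same `hf.co/...` name; `build_payload` and `generate` are hypothetical helper names, not part of any library.

```python
import json
import urllib.request

# Assumptions: default local Ollama endpoint, model name as used in `ollama run`.
MODEL = "hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf"
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = MODEL) -> bytes:
    """Encode a non-streaming generate request as JSON bytes."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def generate(prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"]
```

With the server running, `print(generate("Is 9.11 greater than 9.9?"))` should return a response in the same style as Quick Test 1.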

# Uploaded model

- **Developed by:** kenhktsui
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2.5-7b-instruct-bnb-4bit

This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)