Update README.md
README.md
CHANGED
@@ -10,7 +10,151 @@ tags:
license: apache-2.0
language:
- en
---

# Uploaded model

@@ -20,4 +164,4 @@ language:

This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
license: apache-2.0
language:
- en
datasets:
- kenhktsui/longtalk-cot-v0.1
---
# kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged

After 1 epoch of SFT, the model scores 26% higher (relative) on the ProcessBench GSM-8k subset. However, it scores lower on GSM-8k itself.
Two explanations seem possible: 1) the metrics are not measured correctly because of answer parsing, or 2) the model is overfitting to final answers, since it is not able to critique reasoning steps.

||[kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged](https://huggingface.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged)|[Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)|
|---|---|---|
|ProcessBench (GSM-8k)|**46.0**|36.5 (reported)|
|GSM-8k|78.54|**84.23**|

This needs to be investigated further.
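
One way to probe the parsing hypothesis is to run several answer-extraction rules over the same response and see where they disagree. The sketch below is only an illustration: the three rules and the sample response are assumptions, not the rules used by the evaluation that produced the numbers above. It shows how a long "thinking" response that ends with a `\boxed{...}` answer can be misread by a rule that expects a `#### ` marker or that takes the last number in the text.

```python
import re

# Hypothetical extraction rules, written for illustration only.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")
MARKER = re.compile(r"####\s*(-?[\d,]+(?:\.\d+)?)")
NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_boxed(text):
    """Return the content of the last \\boxed{...} in the text, or None."""
    matches = BOXED.findall(text)
    return matches[-1].strip() if matches else None


def extract_marker(text):
    """Return the number that follows a '#### ' marker, or None."""
    match = MARKER.search(text)
    return match.group(1).replace(",", "") if match else None


def extract_last_number(text):
    """Return the last number that appears anywhere in the text, or None."""
    matches = NUMBER.findall(text)
    return matches[-1].replace(",", "") if matches else None


# Made-up response in the style of the fine-tuned model.
response = "The total is \\boxed{18}. I verified this in 2 ways."

print(extract_boxed(response))        # boxed rule
print(extract_marker(response))       # marker rule
print(extract_last_number(response))  # last-number rule
```

If the three rules disagree on many real responses, the GSM-8k gap may partly be a parsing artifact rather than a drop in accuracy.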

I inspected how **model behavior changed** after SFT. As expected, the fine-tuned model thinks much longer and often solves a problem with multiple approaches instead of one.

### Quick Test 1: Is 9.11 greater than 9.9?
<details>
<summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary>
No, 9.11 is not greater than 9.9. Let's think about this step by step.

First, it helps to understand that both numbers are decimal numbers, specifically they have two digits after the decimal point. So, 9.11 and 9.9 can be compared directly based on their values in each place value: whole number part and the tenths and hundredths parts.

Let's break them down:

- For 9.11:
  - The whole number part is 9.
  - The tenths place is 1 (which means one tenth).
  - The hundredths place is also 1 (one hundredth).

- For 9.9:
  - The whole number part is also 9.
  - But here, the tenths place is 9 (nine tenths), and there's no hundredths digit after that; it's considered to have zero in the hundredths place because decimal numbers with fewer digits are filled with zeros.

So, comparing them:

- Both have the same whole number part: 9.
- For the tenth place:
  - 1 vs. 9.

Since 9 is greater than 1, this already tells us that 9.9 is larger than 9.11. But to be thorough, let's look at the hundredths places as well:

- In 9.11: hundredths = 1
- In 9.9: hundredths = 0

Even though there are no digits in the hundredths place for 9.9, it's still considered zero because of the way decimals work.

Therefore, overall, 9.11 < 9.9.

Wait a minute, but I should make sure that I'm not missing anything here. Sometimes, people might think about adding or subtracting numbers differently, but in this case, since both numbers have the same whole number part and one has a larger tenths digit, it's clear which is larger.

Another way to look at it is to consider them as percentages:

- 9.11 = 911%
- 9.9 = 990%

But that might be confusing; in percentage terms, 990% is still greater than 911%.

Alternatively, if we multiply both numbers by 100 to eliminate the decimal point:

- 9.11 * 100 = 911
- 9.9 * 100 = 990

Again, 990 is greater than 911.

So, no matter how we look at it, 9.9 is larger than 9.11.

I think that covers the comparison thoroughly. There doesn't seem to be any trick or special case here; it's a straightforward decimal comparison once you understand place values.

**Final Answer**

\[ \boxed{9.11 < 9.9} \]
</details>

<details>
<summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
No, 9.11 is not greater than 9.9. When comparing decimal numbers, you can compare them digit by digit from left to right. In this case:

- The whole number part of both numbers is the same (9).
- Comparing the tenths place, we see that 1 in 9.11 is larger than 9 in 9.9.
- Therefore, 9.11 > 9.9.

So, 9.11 is greater than 9.9.
</details>
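
For reference, the place-value comparison that both responses describe can be written down directly. This is a minimal sketch, assuming non-negative inputs given as strings with a whole-number part; `compare_by_place_value` is a hypothetical helper, and Python's standard `decimal` module is used only as a cross-check.

```python
from decimal import Decimal


def compare_by_place_value(a, b):
    """Compare two non-negative decimal strings place by place.

    Returns -1 if a < b, 0 if they are equal, and 1 if a > b.
    """
    a_whole, _, a_frac = a.partition(".")
    b_whole, _, b_frac = b.partition(".")
    # Pad the shorter fractional part with zeros, so "9.9" is read as "9.90".
    width = max(len(a_frac), len(b_frac))
    a_frac = a_frac.ljust(width, "0")
    b_frac = b_frac.ljust(width, "0")
    # Equal-length digit strings compare digit by digit from left to right.
    a_key = (int(a_whole), a_frac)
    b_key = (int(b_whole), b_frac)
    return (a_key > b_key) - (a_key < b_key)


print(compare_by_place_value("9.11", "9.9"))  # place-value result
print(Decimal("9.11") < Decimal("9.9"))       # cross-check with the decimal module
```

Both checks agree with the fine-tuned model's final answer: the tenths digit decides the comparison, because 1 is less than 9.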


### Quick Test 2: How many letters R in the word strawberry?
<details>
<summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary>
Let's see. The word is "strawberry." I need to count how many 'R' letters are there in this word.

First, let me write down the word again: s-t-r-a-w-b-e-r-r-y.

Now, I'll go through each letter and check if it's an 'R'.

Starting with the first letter: S - that's not R.

Second letter: T - also not R.

Third letter: R - yes, that's an 'R'. So, one so far.

Fourth letter: A - nope.

Fifth letter: W - still not R.

Sixth letter: B - not R either.

Seventh letter: E - no.

Eighth letter: R - another 'R'! That makes two now.

Ninth letter: B - no R here.

Tenth letter: E - no R.

Eleventh letter: R - yet again, an 'R'. Now it's three.

Twelfth letter: Y - not R.

So, counting all the way through, I've found three 'R' letters in "strawberry."

Wait a minute, let me double-check. Maybe there are more 'R's that I missed.

Looking at "s-t-r-a-w-b-e-r-r-y." Yes, I did find three: 3rd, 8th, and 11th positions.

I think that's it. There don't seem to be any other 'R's in the word.

Therefore, there are 3 letters R in the word "strawberry."

**Final Answer**

\[ \boxed{3} \]
</details>
<details>
<summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
The letter "R" appears 1 time in the word "strawberry".
</details>
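
The letter-by-letter count can also be checked mechanically. The sketch below uses a hypothetical helper, `count_letter`, that returns the count together with the 1-based positions of each match.

```python
def count_letter(word, letter):
    """Return (count, positions) for `letter` in `word`.

    Matching is case-insensitive and positions are 1-based.
    """
    target = letter.lower()
    positions = [
        index
        for index, char in enumerate(word, start=1)
        if char.lower() == target
    ]
    return len(positions), positions


count, positions = count_letter("strawberry", "R")
print(count, positions)
```

The count of 3 agrees with the fine-tuned model's boxed answer. The word has 10 letters, so the matches sit at positions 3, 8 and 9, which differs from the positions listed in that response.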

### Try out in Ollama!
```
ollama run hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf
```
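
Once the model has been pulled with the command above, it can also be queried from a script. This is a rough sketch, assuming a local Ollama server on its default port (11434) and its `/api/generate` endpoint. `build_request` and `ask` are hypothetical helper names, and nothing here is specific to this model beyond its name.

```python
import json
import urllib.request

MODEL = "hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf"
# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]


# Example call, to be run only when the server is up:
# print(ask("Is 9.11 greater than 9.9?"))
```

With `stream` set to `False`, the server returns one JSON object, so the reply can be read in a single call.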

# Uploaded model

@@ -20,4 +164,4 @@ language:

This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)