---
base_model: unsloth/qwen2.5-7b-instruct-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
license: apache-2.0
language:
- en
datasets:
- kenhktsui/longtalk-cot-v0.1
---
# kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged
After 1 epoch of SFT, the model improves on the ProcessBench GSM-8k subset by 26% (relative, 36.5 to 46.0). However, it scores worse on GSM-8k itself (84.23 to 78.54).
There are two likely explanations: 1) the metrics are not being measured correctly because of answer parsing, or 2) this is a sign of overfitting to answers, as the model is not able to critique reasoning steps.
|Benchmark|[kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged](https://huggingface.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged)|[Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)|
|---|---|---|
|ProcessBench (GSM-8k)|**46.0**|36.5 (reported)|
|GSM-8k|78.54|**84.23**|
This needs to be investigated further.
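One way to probe the parsing hypothesis is to compare two answer-extraction strategies on the same responses. The sketch below is illustrative only and is not the evaluation harness behind the numbers above; the function names are hypothetical. It assumes the fine-tuned model ends with a `\boxed{...}` answer (as in the quick tests on this card) and that a naive parser takes the last number in the response.

```python
import re

def extract_boxed(text):
    # Contents of the last \boxed{...} in the response, or None if absent.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def extract_last_number(text):
    # Last number in the text (thousands separators removed), or None.
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return matches[-1].replace(",", "") if matches else None

def extract_answer(text):
    # Prefer the boxed answer; fall back to the last number in the response.
    boxed = extract_boxed(text)
    if boxed is not None:
        number = extract_last_number(boxed)
        if number is not None:
            return number
    return extract_last_number(text)

# Hypothetical response: the boxed answer is followed by extra commentary.
response = "**Final Answer**\n\\[ \\boxed{18} \\]\nI verified this 2 times."
print(extract_last_number(response))  # naive parser picks up the trailing "2"
print(extract_answer(response))       # boxed-aware parser returns "18"
```

If the two parsers disagree on a meaningful share of GSM-8k responses, the drop in the table may be at least partly a measurement artifact rather than a regression.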
I inspected how **model behavior changed** after SFT. As expected, the fine-tuned model thinks much longer and often solves a problem with multiple approaches instead of one.
### Quick Test 1: Is 9.11 greater than 9.9?
<details>
<summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary>
No, 9.11 is not greater than 9.9. Let's think about this step by step.
First, it helps to understand that both numbers are decimal numbers, specifically they have two digits after the decimal point. So, 9.11 and 9.9 can be compared directly based on their
values in each place value: whole number part and the tenths and hundredths parts.
Let's break them down:
- For 9.11:
- The whole number part is 9.
- The tenths place is 1 (which means one tenth).
- The hundredths place is also 1 (one hundredth).
- For 9.9:
- The whole number part is also 9.
- But here, the tenths place is 9 (nine tenths), and there's no hundredths digit after that; it's considered to have zero in the hundredths place because decimal numbers with fewer digits
are filled with zeros.
So, comparing them:
- Both have the same whole number part: 9.
- For the tenth place:
- 1 vs. 9.
Since 9 is greater than 1, this already tells us that 9.9 is larger than 9.11. But to be thorough, let's look at the hundredths places as well:
- In 9.11: hundredths = 1
- In 9.9: hundredths = 0
Even though there are no digits in the hundredths place for 9.9, it's still considered zero because of the way decimals work.
Therefore, overall, 9.11 < 9.9.
Wait a minute, but I should make sure that I'm not missing anything here. Sometimes, people might think about adding or subtracting numbers differently, but in this case, since both numbers
have the same whole number part and one has a larger tenths digit, it's clear which is larger.
Another way to look at it is to consider them as percentages:
- 9.11 = 911%
- 9.9 = 990%
But that might be confusing; in percentage terms, 990% is still greater than 911%.
Alternatively, if we multiply both numbers by 100 to eliminate the decimal point:
- 9.11 * 100 = 911
- 9.9 * 100 = 990
Again, 990 is greater than 911.
So, no matter how we look at it, 9.9 is larger than 9.11.
I think that covers the comparison thoroughly. There doesn't seem to be any trick or special case here; it's a straightforward decimal comparison once you understand place values.
**Final Answer**
\[ \boxed{9.11 < 9.9} \]
</details>
<details>
<summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
No, 9.11 is not greater than 9.9. When comparing decimal numbers, you can compare them digit by digit from left to right. In this case:
- The whole number part of both numbers is the same (9).
- Comparing the tenths place, we see that 1 in 9.11 is larger than 9 in 9.9.
- Therefore, 9.11 > 9.9.
So, 9.11 is greater than 9.9.
</details>
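
As a reference point independent of either model, the comparison can be checked with exact decimal arithmetic. This is a minimal sketch using Python's standard `decimal` module; the helper name `is_greater` is hypothetical.

```python
from decimal import Decimal

def is_greater(x, y):
    # Compare two decimal strings exactly, avoiding binary float rounding.
    return Decimal(x) > Decimal(y)

print(is_greater("9.11", "9.9"))          # False
print(Decimal("9.9") - Decimal("9.11"))   # 0.79
```

This agrees with the fine-tuned model's conclusion: 9.9 exceeds 9.11 by 0.79. The base model's response starts with the right answer but then contradicts itself.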
### Quick Test 2: How many letters R in the word strawberry?
<details>
<summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary>
Let's see. The word is "strawberry." I need to count how many 'R' letters are there in this word.
First, let me write down the word again: s-t-r-a-w-b-e-r-r-y.
Now, I'll go through each letter and check if it's an 'R'.
Starting with the first letter: S - that's not R.
Second letter: T - also not R.
Third letter: R - yes, that's an 'R'. So, one so far.
Fourth letter: A - nope.
Fifth letter: W - still not R.
Sixth letter: B - not R either.
Seventh letter: E - no.
Eighth letter: R - another 'R'! That makes two now.
Ninth letter: B - no R here.
Tenth letter: E - no R.
Eleventh letter: R - yet again, an 'R'. Now it's three.
Twelfth letter: Y - not R.
So, counting all the way through, I've found three 'R' letters in "strawberry."
Wait a minute, let me double-check. Maybe there are more 'R's that I missed.
Looking at "s-t-r-a-w-b-e-r-r-y." Yes, I did find three: 3rd, 8th, and 11th positions.
I think that's it. There don't seem to be any other 'R's in the word.
Therefore, there are 3 letters R in the word "strawberry."
**Final Answer**
\[ \boxed{3} \]
</details>
<details>
<summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
The letter "R" appears 1 time in the word "strawberry".
</details>
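
A short script gives the ground truth for this test. It is a minimal sketch; the helper name `letter_positions` is hypothetical. Note that the fine-tuned model arrives at the correct count of 3, but its trace walks through twelve letters and reports positions 3, 8, and 11, whereas "strawberry" has ten letters.

```python
def letter_positions(word, letter):
    # 1-based positions of `letter` in `word`, case-insensitive.
    target = letter.lower()
    return [i for i, ch in enumerate(word.lower(), start=1) if ch == target]

word = "strawberry"
positions = letter_positions(word, "R")
print(len(word))        # 10
print(positions)        # [3, 8, 9]
print(len(positions))   # 3
```

So the final answer is right while an intermediate step is wrong, which is the kind of case where answer-level and step-level metrics can diverge.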
### Try it out in Ollama!
```
ollama run hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf
```
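
Once the model has been pulled with the command above, it can also be queried programmatically. The sketch below assumes a local Ollama server on its default port (11434) and uses its `/api/generate` endpoint with streaming disabled; the helper names `build_generate_request` and `ask` are hypothetical, and the model name is assumed to be the same string used with `ollama run`.

```python
import json
import urllib.request

# Assumptions: default local Ollama endpoint, same model name as `ollama run`.
MODEL = "hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf"
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(prompt, model=MODEL, url=OLLAMA_URL):
    # Build a non-streaming generate request as a JSON POST.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(prompt):
    # Send the prompt and return the generated text (requires a running server).
    with urllib.request.urlopen(build_generate_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `ask("Is 9.11 greater than 9.9?")` should return the model's full reasoning trace as a single string.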
# Uploaded model
- **Developed by:** kenhktsui
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2.5-7b-instruct-bnb-4bit
This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)