@@ -10,7 +10,151 @@ tags:
 license: apache-2.0
 language:
 - en
 ---
 # Uploaded  model
@@ -20,4 +164,4 @@ language:
 This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 license: apache-2.0
 language:
 - en
+datasets:
+- kenhktsui/longtalk-cot-v0.1
 ---
+# kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged
+After 1 epoch of SFT, the model improves on the ProcessBench GSM-8k subset by 26% relative to the reported baseline (36.5 to 46.0). However, its GSM-8k score is worse (84.23 to 78.54).
+There seem to be two possible explanations: 1) the metrics are not measured correctly because of answer parsing, or 2) this is a sign of overfitting to final answers, as it is not able to critique reasoning steps.
+||[kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged](https://huggingface.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged)|[Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)|
+|---|---|---|
+|ProcessBench (GSM-8k)|**46.0**|36.5 (reported)|
+|GSM-8k|78.54|**84.23**|
+This needs to be investigated further.
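
One way to probe the parsing hypothesis is to check how final answers are pulled out of long, multi-approach responses. The sketch below is only an illustration: `extract_final_answer` is a hypothetical helper, not the extraction logic of the evaluation that produced the scores above. It assumes the fine-tuned model usually ends with a `\boxed{...}` expression (as in the quick tests in this card) and falls back to the last number in the text otherwise.

```python
import re
from typing import Optional

# Hypothetical helper for illustration; not the parser used for the reported scores.
def extract_final_answer(response: str) -> Optional[str]:
    # Prefer the payload of the last \boxed{...}; otherwise search the whole response.
    boxed = re.findall(r"\\boxed\{([^{}]*)\}", response)
    candidate = boxed[-1] if boxed else response
    # Take the last number found, allowing thousands separators and decimals.
    numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", candidate)
    if not numbers:
        return None
    return numbers[-1].replace(",", "")

print(extract_final_answer("**Final Answer**\n\\[ \\boxed{3} \\]"))  # 3
```

Comparing such an extractor's output with the evaluator's parsed answers on a few long responses would show whether correct answers are being scored as wrong.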
+I inspected how **model behavior changed** after SFT. As expected, the finetuned model thinks much longer and often solves a problem with multiple approaches instead of one.
+### Quick Test 1: Is 9.11 greater than 9.9?
+<details>
+  <summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary>
+No, 9.11 is not greater than 9.9. Let's think about this step by step.
+First, it helps to understand that both numbers are decimal numbers, specifically they have two digits after the decimal point. So, 9.11 and 9.9 can be compared directly based on their values in each place value: whole number part and the tenths and hundredths parts.
+Let's break them down:
+- For 9.11:
+  - The whole number part is 9.
+  - The tenths place is 1 (which means one tenth).
+  - The hundredths place is also 1 (one hundredth).
+- For 9.9:
+  - The whole number part is also 9.
+  - But here, the tenths place is 9 (nine tenths), and there's no hundredths digit after that; it's considered to have zero in the hundredths place because decimal numbers with fewer digits are filled with zeros.
+So, comparing them:
+- Both have the same whole number part: 9.
+- For the tenth place:
+  - 1 vs. 9.
+Since 9 is greater than 1, this already tells us that 9.9 is larger than 9.11. But to be thorough, let's look at the hundredths places as well:
+- In 9.11: hundredths = 1
+- In 9.9: hundredths = 0
+Even though there are no digits in the hundredths place for 9.9, it's still considered zero because of the way decimals work.
+Therefore, overall, 9.11 < 9.9.
+Wait a minute, but I should make sure that I'm not missing anything here. Sometimes, people might think about adding or subtracting numbers differently, but in this case, since both numbers have the same whole number part and one has a larger tenths digit, it's clear which is larger.
+Another way to look at it is to consider them as percentages:
+- 9.11 = 911%
+- 9.9 = 990%
+But that might be confusing; in percentage terms, 990% is still greater than 911%.
+Alternatively, if we multiply both numbers by 100 to eliminate the decimal point:
+- 9.11 * 100 = 911
+- 9.9 * 100 = 990
+Again, 990 is greater than 911.
+So, no matter how we look at it, 9.9 is larger than 9.11.
+I think that covers the comparison thoroughly. There doesn't seem to be any trick or special case here; it's a straightforward decimal comparison once you understand place values.
+**Final Answer**
+\[ \boxed{9.11 < 9.9} \]
+</details>
+<details>
+  <summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
+No, 9.11 is not greater than 9.9. When comparing decimal numbers, you can compare them digit by digit from left to right. In this case:
+- The whole number part of both numbers is the same (9).
+- Comparing the tenths place, we see that 1 in 9.11 is larger than 9 in 9.9.
+- Therefore, 9.11 > 9.9.
+So, 9.11 is greater than 9.9.
+</details>
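
The two responses disagree, and the base model's response also contradicts itself (it opens with "No" and ends with "9.11 is greater than 9.9"). A minimal independent check, using Python's `decimal` module so that binary floating-point representation is not a factor:

```python
from decimal import Decimal

a = Decimal("9.11")
b = Decimal("9.9")

print(a > b)  # False
print(b - a)  # 0.79
```

This agrees with the fine-tuned model's final answer, 9.11 < 9.9.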
+### Quick Test 2: How many letters R in the word strawberry?
+<details>
+<summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf  </summary>
+Let's see. The word is "strawberry." I need to count how many 'R' letters are there in this word.
+First, let me write down the word again: s-t-r-a-w-b-e-r-r-y.
+Now, I'll go through each letter and check if it's an 'R'.
+Starting with the first letter: S - that's not R.
+Second letter: T - also not R.
+Third letter: R - yes, that's an 'R'. So, one so far.
+Fourth letter: A - nope.
+Fifth letter: W - still not R.
+Sixth letter: B - not R either.
+Seventh letter: E - no.
+Eighth letter: R - another 'R'! That makes two now.
+Ninth letter: B - no R here.
+Tenth letter: E - no R.
+Eleventh letter: R - yet again, an 'R'. Now it's three.
+Twelfth letter: Y - not R.
+So, counting all the way through, I've found three 'R' letters in "strawberry."
+Wait a minute, let me double-check. Maybe there are more 'R's that I missed.
+Looking at "s-t-r-a-w-b-e-r-r-y." Yes, I did find three: 3rd, 8th, and 11th positions.
+I think that's it. There don't seem to be any other 'R's in the word.
+Therefore, there are 3 letters R in the word "strawberry."
+**Final Answer**
+\[ \boxed{3} \]
+</details>
+<details>
+<summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
+The letter "R" appears 1 time in the word "strawberry".
+</details>
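
The count can be verified directly. The short check below also lists the 1-based positions of each "r", which is useful when looking at the intermediate steps rather than only the final answer:

```python
word = "strawberry"

count = word.count("r")
# 1-based positions of every "r" in the word
positions = [i for i, ch in enumerate(word, start=1) if ch == "r"]

print(len(word))   # 10
print(count)       # 3
print(positions)   # [3, 8, 9]
```

The fine-tuned model's final count of 3 is correct. Its step-by-step enumeration, however, walks through twelve letters and places the third "R" in the 11th position, while the word has 10 letters with "r" at positions 3, 8 and 9. So the final answer is right even though some intermediate steps are not.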
+### Try out in Ollama!
+```
+ollama run hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf
+```
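
Once the model has been pulled, it can also be queried from a script. The sketch below assumes a local Ollama server on its default port (11434) and uses the `/api/generate` endpoint with streaming disabled; `build_request` and `generate` are illustrative helper names, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf"

def build_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": MODEL, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With the server running, `print(generate("Is 9.11 greater than 9.9?"))` should return a response in the same style as Quick Test 1. The long timeout is a guess to leave room for the model's longer reasoning.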
 # Uploaded  model
 This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)