kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged

With 1 epoch of SFT, it boosts processbench GSM-8k subset by 26%. However, it results in worse result in GSM-8k. It seems to be either 1) the metrics are not measuring correctly due to parsing or 2) sign of overfitting answer, as it is not able to critque reasoning step.

kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged Qwen/Qwen2.5-7B-Instruct
ProcessBench (GSM-8k) 46.0 36.5 (reported)
GSM-8k 78.54 84.23

This needs to be investigated further. I have done inspection of how model behavior changed after SFT. As expected, the finetuned model thinks much longer, and often solve a problem with multiple approaches instead of one.

Quick Test 1: Is 9.11 greater than 9.9?

response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf No, 9.11 is not greater than 9.9. Let's think about this step by step.

First, it helps to understand that both numbers are decimal numbers, specifically they have two digits after the decimal point. So, 9.11 and 9.9 can be compared directly based on their values in each place value: whole number part and the tenths and hundredths parts.

Let's break them down:

  • For 9.11:

    • The whole number part is 9.
    • The tenths place is 1 (which means one tenth).
    • The hundredths place is also 1 (one hundredth).
  • For 9.9:

    • The whole number part is also 9.
    • But here, the tenths place is 9 (nine tenths), and there's no hundredths digit after that; it's considered to have zero in the hundredths place because decimal numbers with fewer digits are filled with zeros.

So, comparing them:

  • Both have the same whole number part: 9.
  • For the tenth place:
    • 1 vs. 9.

Since 9 is greater than 1, this already tells us that 9.9 is larger than 9.11. But to be thorough, let's look at the hundredths places as well:

  • In 9.11: hundredths = 1
  • In 9.9: hundredths = 0

Even though there are no digits in the hundredths place for 9.9, it's still considered zero because of the way decimals work.

Therefore, overall, 9.11 < 9.9.

Wait a minute, but I should make sure that I'm not missing anything here. Sometimes, people might think about adding or subtracting numbers differently, but in this case, since both numbers have the same whole number part and one has a larger tenths digit, it's clear which is larger.

Another way to look at it is to consider them as percentages:

  • 9.11 = 911%
  • 9.9 = 990%

But that might be confusing; in percentage terms, 990% is still greater than 911%.

Alternatively, if we multiply both numbers by 100 to eliminate the decimal point:

  • 9.11 * 100 = 911
  • 9.9 * 100 = 990

Again, 990 is greater than 911.

So, no matter how we look at it, 9.9 is larger than 9.11.

I think that covers the comparison thoroughly. There doesn't seem to be any trick or special case here; it's a straightforward decimal comparison once you understand place values.

Final Answer

[ \boxed{9.11 < 9.9} ]

response from response from Qwen/Qwen2.5-7B-Instruct-GGUF No, 9.11 is not greater than 9.9. When comparing decimal numbers, you can compare them digit by digit from left to right. In this case:
  • The whole number part of both numbers is the same (9).
  • Comparing the tenths place, we see that 1 in 9.11 is larger than 9 in 9.9.
  • Therefore, 9.11 > 9.9.

So, 9.11 is greater than 9.9.

Quick Test 2: How many letters R in the word strawberry?

response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf Let's see. The word is "strawberry." I need to count how many 'R' letters are there in this word.

First, let me write down the word again: s-t-r-a-w-b-e-r-r-y.

Now, I'll go through each letter and check if it's an 'R'.

Starting with the first letter: S - that's not R.

Second letter: T - also not R.

Third letter: R - yes, that's an 'R'. So, one so far.

Fourth letter: A - nope.

Fifth letter: W - still not R.

Sixth letter: B - not R either.

Seventh letter: E - no.

Eighth letter: R - another 'R'! That makes two now.

Ninth letter: B - no R here.

Tenth letter: E - no R.

Eleventh letter: R - yet again, an 'R'. Now it's three.

Twelfth letter: Y - not R.

So, counting all the way through, I've found three 'R' letters in "strawberry."

Wait a minute, let me double-check. Maybe there are more 'R's that I missed.

Looking at "s-t-r-a-w-b-e-r-r-y." Yes, I did find three: 3rd, 8th, and 11th positions.

I think that's it. There don't seem to be any other 'R's in the word.

Therefore, there are 3 letters R in the word "strawberry."

Final Answer

[ \boxed{3} ]

response from Qwen/Qwen2.5-7B-Instruct-GGUF The letter "R" appears 1 time in the word "strawberry".

Try out in Ollama!

ollama run hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf

Uploaded model

  • Developed by: kenhktsui
  • License: apache-2.0
  • Finetuned from model : unsloth/qwen2.5-7b-instruct-bnb-4bit

This qwen2 model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
33
Safetensors
Model size
7.62B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Dataset used to train kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged

Collection including kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged