Feels like a step back from 3.2
Going straight from 3.2 to 3.3, I've noticed a significant loss in quality. I'm running q6 at 16k context with the exact settings I use for 3.2 with the most recent kcpp. I've found that failure to follow format has increased, incredibly short or incredibly long replies are more common, and sometimes it will just ignore the prompt and either parrot from context or make up a new prompt to follow.
Can confirm degradation too. I'm running Q8 with 16k context and the model's responses are awful.
Same.
Author feedback:
@Sao10K
I think I'm one of the few who doesn't completely agree with that. I would say it's like taking one step forward and two steps back. This is the first small model that was able to interpret two characters in the same card coherently, which is a step forward. The steps back are the repetition and the inability to follow the prompt 100%, but I still see room for improvement. Overall, it's the first step in the right direction.