There is progress!
The logic and reasoning test score improved from 8.57/12 to 9.13/12 at temperature 0.5, averaged over many repetitions. That is a very nice result, and slightly better than standard qwen2.5. The model now performs better (on average) in logic and reasoning at temperature 0.5 than at temperature 0. This was not the case in the previous version, where raising the temperature decreased the model's capabilities rather than increasing them. The improvement is visible in prose too (which is, after all, written at a temperature higher than 0): even at temperature 0.9 the model develops threads logically and writes dialogue with a lot of creativity, as long as some reasonable min_p is set (like 0.05-0.1). Nice improvement.
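For anyone unfamiliar with min_p, here is a rough sketch of what the setting does, in plain Python. The function names `min_p_filter` and `sample_token` are made up for illustration, and applying temperature before the min_p cutoff is an assumption; real backends may order the samplers differently.

```python
import math
import random


def min_p_filter(logits, temperature=0.9, min_p=0.05):
    """Return renormalized token probabilities after temperature scaling
    and min_p filtering (hypothetical helper, not any backend's API)."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 for sampling")

    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # min_p: drop every token whose probability is below
    # min_p times the probability of the most likely token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving tokens so they sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


def sample_token(logits, temperature=0.9, min_p=0.05, rng=random):
    """Sample one token index from the filtered distribution."""
    probs = min_p_filter(logits, temperature, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

The point is that the cutoff scales with the model's confidence: when one token dominates, the unlikely tail is removed, which is probably why a higher temperature such as 0.9 stays coherent with min_p set.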
A very nice alternative to nemo, and especially to llama-3-8b.
Thank you for the interesting model.