why?

#1
by Utochi - opened

Why mix 3.1 with 3.2? smh. Guess I'll check it out.

This is alright. It's about as coherent as Stheno v3.2, which is pretty decent, but it doesn't feel like an overall improvement in coherency.
I will note that its sentence structure does seem to be better and a bit more "lucid", so to speak. That being said, if you have the VRAM to load this, I think it may be an improvement over either one of the Stheno models, but the jump is not really worth it IMO unless you have the extra power to run it.

Ranking the model a 6 out of 10 due to its larger size and minimal improvement. I would much rather see Stheno 3.2 go through one of those upscales that fim v2 and space whale have had.

Hi Utochi,
I did the merge, and yeah, it was just an experiment, but I have found that, for some reason, 2x7B or 2x8B MoE models often outperform the individual dense versions. I agree that the improvement is not massive by any stretch, but I do slightly prefer the MoE over either 3.1 or 3.2. I would also love an upscale like Fim v2, and I just uploaded a 15B passthrough merge that initially seems to perform well, although YMMV. Thanks for the feedback.
