Usage impressions
Thank you for carrying on the maintenance of Starcannon, which I believe is an excellent model.
I don't know much about LLMs and only use them lightly through llama.cpp, so take this with a grain of salt, but I feel that v4 followed instructions more directly than v5, vocabulary aside.
Or maybe v5 is just more sensitive to settings such as temperature or the chat template. (I got stuck on the ZeroGPU quota 🤢 and could only try the Mistral and ChatML templates at a few temperatures.)
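For reference, the kind of run I was experimenting with looks roughly like this through the Python bindings (llama-cpp-python). This is just a sketch: the model path, temperature, and prompts are placeholders, not settings I'm recommending.

```python
from llama_cpp import Llama

# Hypothetical local path to a GGUF quant; substitute whatever file you actually use.
llm = Llama(
    model_path="./MN-Starcannon.Q4_K_M.gguf",
    n_ctx=8192,
    chat_format="chatml",  # or "mistral-instruct" to compare templates
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this scene in three sentences."},
    ],
    temperature=0.7,   # the knob I was fiddling with between tries
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```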
Thanks for the feedback! I'm not very well versed in LLMs either (this is more me dipping my toes into the rabbit hole than anything). For my use case (roleplaying) I seem to prefer the feel of this one slightly more, though whether or not it's placebo, time will tell. I've gotten some ideas from the people in Celeste's Discord, so I'm going to have another play around and see if I can get anything better.
Thank you for your reply.
I see. By the way, my use case is to have the LLM translate Japanese into English, write a short story that expands on it, and then convert that into English tags for an image-generation AI.
So I'm trying out a lot of RP models that come close to the capabilities this needs.
It's absurd to ask a single LLM of around 8B to do all of this, so it's purely for fun, but I'm surprised at how well it often works.
(If I really wanted this to work properly, I should split the job across multiple models and supporting algorithms!)
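Just to sketch what I mean by splitting the work: something like the following is what I have in mind. It's purely illustrative; the prompts, the model file, and the helper function are assumptions for the sake of the example, not anything I've actually built.

```python
from llama_cpp import Llama

# Hypothetical model path; any instruct-tuned GGUF would do for this sketch.
llm = Llama(model_path="./model.Q4_K_M.gguf", n_ctx=8192, chat_format="chatml")

def ask_llm(system: str, user: str) -> str:
    """One chat turn; keeps each step of the pipeline as its own narrow prompt."""
    out = llm.create_chat_completion(
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
        temperature=0.7,
        max_tokens=512,
    )
    return out["choices"][0]["message"]["content"]

def japanese_to_image_tags(japanese_text: str) -> str:
    # Step 1: translate the Japanese into plain English, nothing else.
    english = ask_llm("Translate the user's Japanese into natural English. "
                      "Output only the translation.", japanese_text)
    # Step 2: expand the translation into a short story.
    story = ask_llm("Expand the following passage into a short story of a few paragraphs.",
                    english)
    # Step 3: condense the story into comma-separated tags for an image generator.
    return ask_llm("Convert this story into a comma-separated list of English tags "
                   "suitable for an image-generation model.", story)
```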
The unofficial Starcannon v4 also produced usable output with the Mistral template.
I don't know why just the formatting makes such a difference (maybe because the original model was trained for a different purpose?). The Llama- and Mistral-format 8B models tend to give firm, steady responses in many cases, while the ChatML ones, on the other hand, tend to be more rambunctious.
I'll keep playing with your models whenever you update them. 😀