Some benchmarks
Your model scores the highest among Largestral finetunes (not merges) on UGI and my benchmark. Good work.
In my personal experience it feels a bit dumber than the official model, but less so than the other community tunes. It also got hornier and better at negativity. Feels almost worth the sacrifice in intelligence.
Thanks! I've got one more trick up my sleeve that might bring Behemoth v2 closer to OG Largestral.
Using it for a few days, this is my favorite model for writing, and it's still smart enough to have loaded for coding/work, etc. Whatever you did with your slop removal experiments on the smaller models is working.
@gghfez I haven't used the slop removal on anything but Nemo yet xD
I'll try it on Cydonia soon.
Ah okay. I haven't used it for "role-playing", but I'm finding it's great at "write X in the style of " style prompts.
Prompt: "Write a story based on Battlestar Galactica in the prose of Haruki Murakami from the perspective of Gius Baltar"
The Behemoth story is the only one which feels like a Murakami novel but also understands the character in the sci fi series I referenced.
Mistral-Large, on the other hand, feels like a Mistral-Large story with its "hushed corridors".
@gghfez wow, that's actually pretty good. did you use metharme or mistral?
Mistral.
Generally, I've noticed that with these finetunes of Instruct models, the prose/voice changes still come through if you use the original template.
This model has become my favorite multi-purpose tool. Subjectively, it is the best balance of creativity and smarts available today. It has become my current 'daily driver' ... well done