Unintentionally more stable than v4.
Pleasantly surprised with this one so far; it seems to be giving decent results on my end.
@Lewdiculous
I left the image the same. It's basically v4 with Mistral 0.2's config and weights, built from two sub-merges combined into this one. It should have close to 32k context w/o SWA if everything went correctly.
It's all happy accidents
0.2 bringing good luck
Long context and vision?
The audacity!
@Nitral-AI It's quanting. Unrelated, but lemme sell you on something cool:
https://github.com/ajeetdsouza/zoxide
It's addictive and I can never use regular cd anymore after this. I recommend you fully replace your cd with it.
@Nitral-AI – You have done it. Graced by the glory of Mistral 0.2, not even I am complaining about V4 (yet? KEKW). Congratulations!
As per feedback from LocalBasedMan, author of Erosumika, the 0.2 base really benefits from very tame, low-temperature sampling settings; that's where it is most stable while still not feeling particularly repetitive. I have uploaded presets I consider "good starting points" here:
https://huggingface.co/Lewdiculous/Model-Requests/tree/main/data/presets/lewdicu-3.0.2-mistral-0.2
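To illustrate what "tame" means in practice, here's a rough sketch of the shape such a preset takes. These are hypothetical numbers, not the values from the linked presets, and the parameter names just follow common llama.cpp-style samplers:

```python
# Hypothetical illustration only: NOT the values from the linked presets,
# just an example of what a "tame, low-temperature" sampler config looks like.
low_temp_preset = {
    "temperature": 0.8,      # keep randomness modest
    "min_p": 0.05,           # prune very unlikely tokens instead of a hard top_k cutoff
    "top_p": 1.0,            # effectively disabled while min_p does the filtering
    "repeat_penalty": 1.05,  # light touch, enough to avoid loops without flattening style
}
```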
For Eris V4: formatting is good (cards tested: Chiaki, where multiple characters speak/act per response, at 170 tokens/response; and Mesugaki Correction School, with RPG-style status information in responses, at 350 tokens/response), writing seems good, and vision is good. I haven't tested intelligence in particular, but she is adhering to the appropriate characters, so I'll take that as a positive indication.
Appreciate the feedback, my dude. This one took way longer to pull off than I wanted, but I'm glad it worked out. Thank you as always for the quants as well. Regarding intelligence, it's kind of hard for me to say: it's definitely not as smart as 3.05 raw, but it doesn't feel hugely off from 3.075.
@Nitral-AI I'm thinking about adding some explicit NSFW chats to the imatrix calibration data, some entries from the recent RP-NSFW-test database from the fellow Chaotics; I believe it's on Replacement. But anyway, do you have a way to eval quant quality? I was thinking about directly comparing two different Q4_K_M-imat quants of V4-32K, or maybe the IQ3_M instead, to look for a more dramatic difference.
Would you be able to measure things like KL divergence? It's not a decisive score on its own, but still... I just don't have this set up, so I'd like to see if you do.
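Roughly what I have in mind, as a sketch only: assuming we can dump per-token logits for the same eval text from both the f16 reference and the quant (the shapes and file names below are made up), the comparison would look like this:

```python
# Sketch: mean KL divergence of a quant's token distributions from the f16
# reference, over the same eval text. Logit dumps and file names are assumed.
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mean_kl_divergence(ref_logits, quant_logits, eps=1e-10):
    """Average KL(P_ref || P_quant) over token positions.

    Both arrays have shape (n_tokens, vocab_size) and come from running the
    two models over the exact same token sequence.
    """
    p = softmax(ref_logits)    # f16 reference distribution per position
    q = softmax(quant_logits)  # quantized-model distribution per position
    kl_per_token = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(kl_per_token.mean())

# Hypothetical usage: lower mean KL means the quant tracks the f16 model more closely.
# ref = np.load("logits_f16.npy"); qnt = np.load("logits_q4km.npy")
# print(mean_kl_divergence(ref, qnt))
```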
You could run perplexity from within llama.cpp on the quants, although I don't think that's the end-all-be-all test, generally speaking or in this case. @Lewdiculous
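For what the PPL number itself boils down to, here's a minimal sketch, assuming you already have the per-token log-probabilities of the eval text from the quant (llama.cpp's perplexity tool handles all of this for you):

```python
# Sketch: perplexity is exp(mean negative log-likelihood) over the eval tokens.
# The log-probabilities here are assumed to come from some external dump.
import numpy as np

def perplexity(token_logprobs):
    """PPL from natural-log per-token probabilities; lower is better."""
    nll = -np.asarray(token_logprobs, dtype=float)
    return float(np.exp(nll.mean()))

# Hypothetical usage: run the same eval file through two quants and compare.
# print(perplexity(np.load("logprobs_q4km.npy")))
```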
And no, I don't have anything set up for GGUF quants to test quant quality. I typically use the per-layer accuracy score against f16 from the quant (you can see this when making exl2 quants) and average it over the 32 hidden layers. But even that isn't a perfect method, since it only tells you how accurate the quant is to f16 as the basis of comparison.
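The rough idea in code, with the caveat that this uses cosine similarity between captured layer outputs as the per-layer score, which may not be exactly the metric exl2 reports, and capturing the outputs is left to whatever stack you use:

```python
# Sketch: average "accuracy vs f16" over the hidden layers. The per-layer score
# here is cosine similarity between each layer's outputs on the same inputs;
# how the f16/quant layer outputs are captured is assumed, not shown.
import numpy as np

def layer_score(f16_out, quant_out):
    """Cosine similarity between flattened layer outputs (1.0 = identical)."""
    a, b = np.ravel(f16_out), np.ravel(quant_out)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def average_layer_accuracy(f16_layers, quant_layers):
    """Mean per-layer score over all hidden layers (32 for a Mistral-7B-sized model)."""
    return float(np.mean([layer_score(f, q) for f, q in zip(f16_layers, quant_layers)]))
```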
PPL will do. Unless something looks very wrong.