Getting gibberish output on Q6
EDIT: This appears to be an Oobabooga issue. The Q6 quant works perfectly in KoboldCpp.
I loaded up magnum-v2-123b-Q6_K-00001-of-00003.gguf (did not merge the split files) and asked it to analyze 700 words of text. It responded with nonsense.
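In case it helps anyone else loading split files without merging them, here is a rough Python sketch for confirming that every shard is actually sitting next to the first one. It assumes the shards follow the same `-00001-of-00003.gguf` naming pattern as the file above; the helper names `expected_shards` and `missing_shards` are made up for illustration and are not part of any loader.

```python
import re
from pathlib import Path

# Assumed naming pattern: <prefix>-NNNNN-of-MMMMM.gguf, as in the filename above.
SHARD_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def expected_shards(first_shard: str) -> list[str]:
    """Return the filenames of all shards implied by one shard's name."""
    match = SHARD_RE.match(Path(first_shard).name)
    if match is None:
        raise ValueError(f"not a split GGUF filename: {first_shard}")
    total = int(match["total"])
    prefix = match["prefix"]
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


def missing_shards(first_shard: str) -> list[str]:
    """Return the expected shard filenames that are not in the same folder."""
    folder = Path(first_shard).parent
    return [name for name in expected_shards(first_shard)
            if not (folder / name).is_file()]


if __name__ == "__main__":
    # Point this at wherever the first shard lives on your machine.
    print(missing_shards("magnum-v2-123b-Q6_K-00001-of-00003.gguf"))
```

An empty list only means the files are present. It says nothing about whether a download was truncated or corrupted, so comparing file sizes or checksums against the repo is still worth doing.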
I normally have 120GB of VRAM, but a 3090 decided to go on the fritz last night, so I replaced it with a 4080. I now have 112GB of VRAM. I set context in Oobabooga to 9k and ticked all the boxes you see in the image below. The only other thing I did was load up the stock Mistral template in Oobabooga.
Am I doing something wrong?
Ah shit, I just remembered you saying you're on vacation for the next 2 weeks. Oh well, keeping this comment live on the off chance someone can help.
I happened to download Q6_K like 5 minutes ago, and I'm using it with koboldcpp. Seems to work fine, FWIW. Oh, just saw your edit, lol.
Well, anyway, wondering if you've found any sampler settings that you like with this model, etc.