bartowski/magnum-v2-123b-GGUF · Getting gibberish output on Q6

EDIT: This appears to be an Oobabooga issue. The Q6 quant works perfectly in Kobold.cpp.

I loaded up magnum-v2-123b-Q6_K-00001-of-00003.gguf (did not merge the split files) and asked it to analyze 700 words of text. It responded with nonsense.

I normally have 120 GB of VRAM, but a 3090 decided to go on the fritz last night, so I replaced it with a 4080. I now have 112 GB of VRAM. I set context in Oobabooga to 9k and ticked all the boxes you see in the image below. The only other thing I did was load up the stock Mistral template in Oobabooga.

Am I doing something wrong?

Ah shit, I just realized I recall you saying you're on vacation for the next 2 weeks. Oh well, I'm keeping this comment live on the off chance someone can help.