Problem Model

by ClaudioItaly - opened Jul 2, 2024

Discussion

ClaudioItaly

Jul 2, 2024

this model is broken. He starts making endless signs. I've tried every way, I give up

ooooz

Jul 2, 2024

It's not broken for me, you need the last llama.cpp.

bartowski

Owner Jul 2, 2024

What he said ^

And if you're still having issues, make sure flash attention is off

ClaudioItaly

Jul 3, 2024

What he said ^

And if you're still having issues, make sure flash attention is off

Lm Studio updated the ccp yesterday and the problem remains even without flah

AndrewLockhart

Jul 3, 2024

Same. same. broken ggufs

bartowski

Owner Jul 3, 2024

I'm not seeing any issues either on lmstudio 0.2.27, can you guys share your hardware and settings?

Tested on windows with a 3070 and linux with a 3090

ooooz

Jul 4, 2024

No issue here and i even tried it in french for a full 8k tokens chat.
windows 10 with a 3090 ti llama.cpp + sillytavern as a front end and also tried with kobold.cpp (q8 quant)

AndrewLockhart

Jul 4, 2024

•

edited Jul 4, 2024

I tried this https://huggingface.co/legraphista/Gemma-2-9B-It-SPPO-Iter3-IMat-GGUF and it works fine, so idk what is wrong with these ggufs, I am using koboldcpp latest. And for me, it doesn't even load the model, errors occur.

bartowski

Owner Jul 4, 2024

That's very interesting since that quant from @legraphista (tagged so you can consider updating) is using a version of llama.cpp with a broken Gemma 2 implementation, so your experience should only be better on mine. What's the error you get @AndrewLockhart ?