Exception errors attempting to run in Koboldcpp

#3 opened by CherylF

I've tried several quants, but they never load in Koboldcpp; they fail before populating the GPUs.
I'm wondering if I should be using a batch file with extra commands. If so, please let me know.
I have been using Koboldcpp for close to a year, so I really don't understand why this model is causing exceptions.
EXL2 works okay in Ooba.
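
To be concrete, this is the kind of batch file I mean. The model path, GPU layer count, and context size below are just placeholder values from my setup, so if different flags are recommended for this model, I'm happy to try them:

REM placeholder path and values; adjust the model, layers, and context for your own setup
koboldcpp.exe --model "C:\models\your-model.Q4_K_M.gguf" --contextsize 8192 --gpulayers 35 --usecublas
REM keep the window open so any exception text stays visible
pause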

BeaverLegacy org

Hi, have you tried the quants w/ or w/o the OLD tag? Maybe it OOM'd? Have you tried a lower context length? Have you tried updating KCPP?

Yeah, I'm on the latest version of KCPP, but I hadn't thought about the OLD quants. I'm downloading them now, and hopefully that will work.
Thank you for responding and for your help.

Not working for me either with the new quants and KCPP 1.64.1. It crashes immediately. Seems like a KCPP issue, since quants from mradermacher and MarsupialAI crash the app as well.

Edit: The above is confirmed. The quants load fine in Oobabooga; KCPP must be on an old version of llama.cpp.

I think you'll have to wait until KoboldCPP picks up this change:

https://github.com/LostRuins/koboldcpp/commit/889bdd76866ea31a7625ec2dcea63ff469f3e981
