llama.cpp tokenization bug
This GGUF, and others derived from Llama 3 models, is probably affected by
https://github.com/ggerganov/llama.cpp/pull/6920
Can you recreate your quantization with the fixed commit?
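One way to tell an old quant from a re-made one: the updated converter from that PR writes a `tokenizer.ggml.pre` metadata key into the GGUF, which pre-fix quants lack. Below is a minimal sketch of a header-only scanner for that key, assuming the GGUF v3 layout from the ggml spec; the file path in the usage example is a placeholder.

```python
import struct

GGUF_MAGIC = b"GGUF"

# GGUF metadata value types and their fixed sizes in bytes (per the GGUF spec)
_SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING = 8
_ARRAY = 9

def _read_u32(f):
    return struct.unpack("<I", f.read(4))[0]

def _read_u64(f):
    return struct.unpack("<Q", f.read(8))[0]

def _read_string(f):
    # GGUF strings are a uint64 length followed by the raw bytes
    return f.read(_read_u64(f))

def _skip_value(f, vtype):
    """Advance past one metadata value without decoding it."""
    if vtype in _SCALAR_SIZES:
        f.read(_SCALAR_SIZES[vtype])
    elif vtype == _STRING:
        _read_string(f)
    elif vtype == _ARRAY:
        elem_type = _read_u32(f)
        count = _read_u64(f)
        for _ in range(count):
            _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")

def gguf_metadata_keys(path):
    """Return the metadata keys of a GGUF file (reads the header only)."""
    keys = []
    with open(path, "rb") as f:
        if f.read(4) != GGUF_MAGIC:
            raise ValueError("not a GGUF file")
        _read_u32(f)            # version
        _read_u64(f)            # tensor count
        kv_count = _read_u64(f)
        for _ in range(kv_count):
            keys.append(_read_string(f).decode("utf-8"))
            _skip_value(f, _read_u32(f))
    return keys

def has_pretokenizer_fix(path):
    """True if the GGUF was converted after the PR #6920 pre-tokenizer fix."""
    return "tokenizer.ggml.pre" in gguf_metadata_keys(path)
```

Usage would be e.g. `has_pretokenizer_fix("model-Q4_K_M.gguf")` (placeholder filename); `False` means the quant predates the fix and should be re-made from the original HF weights with the updated converter.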
If you use llama.cpp, you can try this
About this:
I'll redo them on demand, starting with the most popular and best-performing models, and will add a notice to the ones that still need updating. KoboldCpp has to pull in the upstream fixes before its users can actually benefit from them, and there's still a potential issue to be solved:
Can you recreate your quantization with the fixed commit?
@FlareRebellion - Will do these quants again and reupload. You'll still have to wait for KCPP 1.64 release to get the benefits but quants will at least already be ready.
Issues seem to be getting fixed already using the latest llama.cpp:
https://github.com/ggerganov/llama.cpp/issues/6914#issuecomment-2084315900
Still facing issues with Aurora's tokenizer... I'll wait a bit longer before looking into it; it might be a separate issue.
@FlareRebellion For now I'd recommend you check out https://huggingface.co/Lewdiculous/Chaos_RP_l3_8B-GGUF-IQ-Imatrix, which should be as good as Aurora or better, and which I was able to re-quant properly. I'll talk to the author about Aurora.