Could you quant maldv/badger-iota-llama-3-8b?
I actually tried this morning but failed; let me re-run it to see what the problem was.
not supported by llama.cpp, but maybe I can work around it
The pretokenizer is not supported by llama.cpp, so I am forcing it to llama-3, which is hopefully the closest match. Quants should be incoming soon. Cheers!
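In case you want to reproduce it, the workaround is roughly a one-line patch to `get_vocab_base_pre()` in llama.cpp's `convert-hf-to-gguf.py`: map the unrecognized tokenizer fingerprint onto the llama-3 pretokenizer. A minimal sketch, where `UNKNOWN_CHKHSH` is a placeholder for whatever fingerprint the script reports for this model:

```python
# Inside get_vocab_base_pre() in convert-hf-to-gguf.py (sketch).
# UNKNOWN_CHKHSH stands in for the fingerprint printed for this model.
if chkhsh == UNKNOWN_CHKHSH:
    res = "llama-bpe"  # treat it as the stock llama-3 pretokenizer
```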
Thanks man. I'll have to dig into this as well and see if I have something wrong in a config. I know that when I rope-scaled with 'dynamic' it would break lcpp, so I wonder if there is something lingering from that.
It's merely the pretokenizer, not the rope config. I don't know exactly what's wrong, but one possibility is that the model you got the tokenizer config from is based on llama-3 before they applied a fix to their repo. Or after they applied a fix. (I think https://huggingface.co/meta-llama/Meta-Llama-3-8B hashed to 0ef9807a4087ebef797fc749390439009c3b9eda9ad1a097abbe738f486c01e5, which is what llama.cpp uses, and now it's c136ed14d01c2745d4f60a9596ae66800e2b61fa45643e72436041855ad4089d.)
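If you want to check which side of the fix your tokenizer landed on, the fingerprint is just a sha256 over the tokenizer's encoding of a fixed test string. Something like the sketch below; note that `chktxt` here is a placeholder (the real test string has to be copied out of `convert-hf-to-gguf.py`), and the model name is just an example:

```python
from hashlib import sha256
from transformers import AutoTokenizer

# Placeholder: copy the real chktxt from get_vocab_base_pre()
# in llama.cpp's convert-hf-to-gguf.py.
chktxt = "..."

tok = AutoTokenizer.from_pretrained("maldv/badger-iota-llama-3-8b")
chkhsh = sha256(str(tok.encode(chktxt)).encode()).hexdigest()
print(chkhsh)  # compare with 0ef9807a... (old) vs c136ed14... (current)
```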
As such, everything might be fine :)
I see there is a bug report open for this (https://github.com/ggerganov/llama.cpp/issues/7069), but I guess it's being ignored, as llama.cpp thinks only "important" models matter.