(Special Stop Token Triggered! ID:2)
How do I stop this from triggering? My output gets cut off prematurely. I'm using recent versions of KoboldCPP and SillyTavern.
Is this in KoboldCPP's logs, or SillyTavern's?
It's in KoboldCPP's.
Hmmm
The token with ID 2 is "</s>", the end-of-sequence (EOS) token for Mistral Nemo's default format. Are your Context Template and Instruct Template set to ChatML (or a ChatML variant)?
Also, where did you get the GGUF file from, and does their version of the ChatML-ified Mistral Nemo give you similar trouble?
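If it helps to confirm which token is cutting things off, here's a rough sketch for pulling the stop-token IDs out of a saved copy of the KoboldCPP console output. It assumes the log line looks exactly like the one quoted at the top of the thread. The names `stop_token_ids`, `describe`, and `KNOWN_SPECIAL_TOKENS` are made up for illustration, and the only mapping filled in is the ID 2 one mentioned above; anything else would need to come from the model's own tokenizer files.

```python
import re

# Matches the KoboldCPP log line quoted at the top of this thread.
# Assumption: the line format is exactly "(Special Stop Token Triggered! ID:<n>)".
STOP_LINE = re.compile(r"\(Special Stop Token Triggered! ID:(\d+)\)")

# Hypothetical lookup table. Only the ID 2 entry comes from this thread;
# add others after checking the model's tokenizer files.
KNOWN_SPECIAL_TOKENS = {2: "</s>"}


def stop_token_ids(log_text):
    """Return every stop-token ID reported in the log text, in order."""
    return [int(token_id) for token_id in STOP_LINE.findall(log_text)]


def describe(token_id):
    """Give a short label for a stop-token ID."""
    name = KNOWN_SPECIAL_TOKENS.get(token_id, "unknown (check the tokenizer files)")
    return f"ID {token_id}: {name}"


if __name__ == "__main__":
    sample = "(Special Stop Token Triggered! ID:2)"
    for token_id in stop_token_ids(sample):
        print(describe(token_id))
```

If every interruption reports ID 2, the model really is emitting "</s>" and the backend is just honoring it, which would point at the template or the quant rather than at a log glitch.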
I got the GGUF file from mradermacher, and I'm using his imatrix quant. When I set the context template to Alpaca, it doesn't trigger as often. I mainly get this interruption when I'm using either the ChatML or Mistral context template.
I see.
If you wouldn't mind, we could try using Featherless as the backend. This way we can narrow it down to either your ST setup or KoboldCPP/your quant file.
If that's cool, I can send you a temporary key. Usage won't cost me anything since it's a subscription, but it'll count towards my concurrent requests so I'd revoke it once you're done.
For the record, I looked at the tokenizer files, and they're the same as the ChatML-ified Mistral Nemo, so if it is a problem with the tokenizer, it's at least not my fault. :P I do suspect the backend though. This test would definitively eliminate one or the other. If Featherless works, I can recommend trying a different quant. (Or you can just do that anyway. Maybe that's a better idea!)
Did you ever get this figured out? Don't wanna leave you hanging.