Issue with the tokenizer
This model causes a crash when the input includes "<|im_end|>" or "<|im_start|>" tokens.
- The Nous-Mixtral model added two new tokens for the ChatML format, expanding the model's vocab_size and the input embeddings tensor dimension by two, from 32000 to 32002.
- This model copied the tokenizer files from that Nous model, but the actual input embeddings are still the vanilla size, so when those tokens appear in the prompt you get an "index out of range" error.
To fix it, replace the tokenizer.json and tokenizer_config.json files with the ones from the base Mixtral and delete the added_tokens.json file.
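For anyone who wants to check the mismatch themselves, here is a minimal sketch using the transformers API (the repo ID is a placeholder, and only the tokenizer and config are read, so no weights get downloaded):

```python
# Minimal sketch: compare the tokenizer's vocabulary size with the vocab_size
# declared in the model config. The repo ID below is a placeholder.
from transformers import AutoConfig, AutoTokenizer

repo_id = "your-name/your-mixtral-merge"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(repo_id)
config = AutoConfig.from_pretrained(repo_id)

print("tokenizer vocab:", len(tokenizer))       # 32002 if the ChatML tokens are still there
print("config vocab_size:", config.vocab_size)  # 32000 for vanilla Mixtral embeddings
print("added tokens:", tokenizer.added_tokens_decoder)

# If len(tokenizer) > config.vocab_size, tokens like <|im_end|> map to ids
# (32000, 32001) that fall outside the input embedding matrix, which is what
# produces the "index out of range" crash at inference time.
```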
Thanks for the heads up! I have made the change you suggested and hopefully this fixes the crash. (Due to hardware constraints I have to convert everything to .gguf to use it and can't actually test the HF/FP16 version properly).
Something is still wrong with this model's config; I can't use the grammar feature because of it:
https://github.com/oobabooga/text-generation-webui/issues/5369
I used this quant; maybe the problem is there?
https://huggingface.co/Artefact2/SensualNousInstructDARETIES-CATA-LimaRP-ZlossDT-SLERP-8x7B-GGUF/blob/main/SensualNousInstructDARETIES-CATA-LimaRP-ZlossDT-SLERP-8x7B-Q5_K_S.gguf
Edit: I also tried this quant (which doesn't use an iMatrix) and the problem persists; a quick check of what the quant actually contains is sketched below:
https://huggingface.co/Envoid/SensualNousInstructDARETIES-CATA-LimaRP-ZlossDT-SLERP-8x7B-GGUF/blob/main/SensualNousInstructDARETIES-CATA-LimaRP-ZlossDT-SLERP-8x7B-q6_K.gguf
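In case it helps narrow things down, here is a rough sketch (assuming llama-cpp-python is installed; the file name is just the quant linked above) for checking which vocabulary ended up baked into a downloaded quant:

```python
# Rough sketch: read the vocab/metadata of a downloaded quant to see whether
# the 32002-token vocab (with the ChatML tokens) made it into the GGUF.
from llama_cpp import Llama

llm = Llama(
    model_path="SensualNousInstructDARETIES-CATA-LimaRP-ZlossDT-SLERP-8x7B-Q5_K_S.gguf",
    vocab_only=True,  # should skip loading the 8x7B weights (behavior may vary by version)
)

print("n_vocab:", llm.n_vocab())  # 32000 = base Mixtral vocab, 32002 = ChatML tokens included
```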