Model config.json has Mistral params instead of Mixtral, breaking ExLlama quants and maybe affecting others too
#3
opened by TheBloke
I got reports that ExLlamav2 wasn't working with this GPTQ. It turns out that's because it was trying to load it as a Mistral model, due to the architecture in `config.json` being set to Mistral instead of Mixtral.

Also, `rope_theta` should be 1000000.0 for Mixtral; this can affect inference quality.
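For anyone who wants to patch a local copy rather than re-download, here is a minimal sketch of the fix, assuming a standard Hugging Face-style `config.json` in the model directory (the `model_type` field and the old Mistral default of 10000.0 are my assumptions about the broken config, not confirmed from this repo):

```python
import json

# Hypothetical path to the quantised model's config; adjust to your local copy.
CONFIG_PATH = "config.json"

with open(CONFIG_PATH) as f:
    cfg = json.load(f)

# Point loaders at the Mixtral architecture instead of Mistral.
cfg["architectures"] = ["MixtralForCausalLM"]
cfg["model_type"] = "mixtral"  # assumed field; standard in HF Mixtral configs

# Mixtral was trained with rope_theta = 1e6 (Mistral's default is 1e4).
cfg["rope_theta"] = 1000000.0

with open(CONFIG_PATH, "w") as f:
    json.dump(cfg, f, indent=2)
```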
I don't think any of this would stop k-quants from working, though, so that issue might be unrelated. I'll try making some anyway.
Undi95 changed pull request status to merged