Doesn't work for me with the ExLlama loader, only with transformers

by kriss

Great work!

Getting the following error when trying to load it with the new ExLlama loader via the WebUI:

```
text-generation-webui/repositories/exllama/model.py", line 554, in __init__
    with safe_open(self.config.model_path, framework="pt", device="cpu") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
```
Is it just a VRAM issue on my end?
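
For what it's worth, a quick way to sanity-check whether the `.safetensors` file is what `safe_open` expects (the path here is just an example):

```python
import struct
from pathlib import Path

# A safetensors file starts with an 8-byte little-endian header length,
# followed by that many bytes of JSON. "HeaderTooLarge" usually means those
# first 8 bytes decode to an implausibly huge number, i.e. the file isn't
# really safetensors (wrong file, or a git-lfs pointer that was never pulled).
path = Path("models/airoboros-7b/model.safetensors")  # hypothetical path
with path.open("rb") as f:
    (header_len,) = struct.unpack("<Q", f.read(8))
print(f"declared header length: {header_len} bytes "
      f"(file is {path.stat().st_size} bytes)")
```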

It works well with transformers, though.
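
For reference, this is roughly what the transformers loader is doing under the hood; the model id is my guess at this repo's name, so adjust it to whatever you actually downloaded:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jondurbin/airoboros-7b-gpt4-1.2"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # full fp16 weights, so 7b needs roughly 14 GB of VRAM
    device_map="auto",          # requires the accelerate package
)
```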

I haven't used exllama, but taking a quick peek at the repo, it looks like it's intended for the 4-bit GPTQ versions of the models. TheBloke has kindly made GPTQ (and GGML) quantized versions of all of these (7b through 65b). The 7b version is here:
https://huggingface.co/TheBloke/airoboros-7B-gpt4-1.2-GPTQ
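
If it helps, pulling those quantized weights down locally is a one-liner (assuming you have huggingface_hub installed):

```python
from huggingface_hub import snapshot_download

# Downloads TheBloke's 4-bit GPTQ weights into the local HF cache and
# returns the directory; point the webui's models folder at the result.
local_dir = snapshot_download(repo_id="TheBloke/airoboros-7B-gpt4-1.2-GPTQ")
print(local_dir)
```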

I used qlora for all versions this time, rather than a full fine-tune, so the smaller 7b/13b models may be a bit worse than the 1.1 versions for some prompts, but I don't have any direct evidence of that.

The 33b and 65b versions perform quite well with qlora tuning, however.
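
For anyone curious what that looks like in code, here's a minimal sketch of a qlora setup with peft + bitsandbytes; the base model id, ranks, and target modules below are illustrative stand-ins, not my exact training config:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization for the frozen base model (the "q" in qlora).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",  # hypothetical base model id
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Small trainable LoRA adapters on the attention projections; ranks and
# target modules are illustrative, not the exact values used here.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapters train; the 4-bit base stays frozen
```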
