Is this ExLlamaV2?

#1
by OrangeApples - opened

Hi @lucyknada. I loaded this up in Oobabooga and it selected Transformers as the loader by default. It still works with the ExLlamaV2 loader, but prompt processing takes much longer than with normal exl2 quants. What is this?

Yeah, this is just an exl2 quant of https://huggingface.co/cloudyu/Mixtral_34Bx2_MoE_60B that I did for someone who asked about it.

Thanks for clarifying. I must have botched some settings, which caused the prompt processing to slow down then. It fits nicely into 24GB VRAM, so thanks for uploading this!

OrangeApples changed discussion status to closed
