Please open mouth kiss the homies.

#1
by snombler - opened

This could be us but you playin'.
gareiyuri.gif

(Making exl2s before ggufs is a crime.)

llama.cpp's tokenization handling in the past two months is perhaps equally criminal.

Not wrong! But until someone else wants to support split loading, it's all we've really got, sadly. Also, thanks for all your contributions.

tbh exl2 simply produces better outputs.

I am graciously willing to accept 3090s to run exl2s for anyone who has them to spare. I'll need enough to run at least 64k context.

Only see a 2-bit exl2 but a Q4_K_M gguf. We got different definitions of "before".

It's just proof that bullying works.
