Great work.

#1 by RedLeader721 - opened

Any chance for a 4 bit ggml .bin model or is that asking too much?

Analytics Club at ETH Zürich org

Thank you! IIRC this is not yet possible, but it's being worked on — see here for details: https://github.com/ggerganov/llama.cpp/issues/1333

I'll second that — great work and many thanks!

I've been unable to run the model on my local PC for similar reasons (only 16 GB of RAM), but thanks to this I was able to load it on my RTX A4000.

Analytics Club at ETH Zürich org

Hey, just closing this out, as it seems your prayers may have been answered!

I don't see any checkpoints on the hub at time of writing, but do let me know if you add any :)

pszemraj changed discussion status to closed
