4-bit GGML version?
#4
by jbollenbacher - opened
Hi team! Love your work on OpenAssistant!
Can we get these models ported to a 4-bit GGML version? Especially the Pythia-based models, but the LLaMA ones would be nice too. This would make them much more portable and easier to experiment with.
Thanks!
Please see my repositories here and on GitHub. GPT-NeoX-based models (the OpenAssistant StableLM and Pythia models) will not run in llama.cpp or ggml at the moment, but my fork of llama.cpp will. Best of luck!
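
For the LLaMA-based checkpoints, here is a minimal sketch of the usual llama.cpp convert-then-quantize flow. The model directory name is hypothetical, and the exact script name and quantization flags vary by llama.cpp version (older versions use convert-pth-to-ggml.py and a numeric quantization type), so treat this as illustrative rather than exact:

```
# Convert the PyTorch/Hugging Face checkpoint to an f16 GGML file.
# convert.py ships with llama.cpp; "models/oasst-llama-7b" is a placeholder path.
python3 convert.py models/oasst-llama-7b/

# Quantize the f16 file down to 4 bits (q4_0).
./quantize models/oasst-llama-7b/ggml-model-f16.bin \
           models/oasst-llama-7b/ggml-model-q4_0.bin q4_0

# Run the quantized model with the llama.cpp example binary.
./main -m models/oasst-llama-7b/ggml-model-q4_0.bin -p "Hello" -n 128
```

The GPT-NeoX-based models (StableLM, Pythia) use a different architecture, so this flow only works for them in a llama.cpp fork with GPT-NeoX support, as noted above.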