llama.cpp support?
#6
by ct-2 · opened
Is there a way to run this from RAM, or with disk offload, using Transformers in 4-bit? Thanks!
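Something along these lines is what I have in mind; just a rough sketch assuming bitsandbytes 4-bit plus accelerate offload actually works for this architecture (the model ID is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "org/model"  # placeholder for this repo's model ID

# 4-bit NF4 quantization via bitsandbytes; whatever doesn't fit on the GPU
# would need to be offloaded to CPU RAM and then disk by accelerate.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    llm_int8_enable_fp32_cpu_offload=True,  # offloaded modules stay in fp32
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",         # let accelerate split across GPU/CPU/disk
    offload_folder="offload",  # spill-over location on disk
    trust_remote_code=True,    # if the repo ships custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```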
Support is being worked on in llama.cpp; follow the issue at https://github.com/ggerganov/llama.cpp/issues/6877. That requires not only model support landing, but also someone actually producing quantizations, which will take a very long time given the size of the model (and will be wildly impractical for most users).
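Once support lands and someone does publish GGUF quantizations, running from system RAM could look roughly like this with llama-cpp-python; the filename is hypothetical since no quants exist yet:

```python
from llama_cpp import Llama

# Hypothetical GGUF filename; no quantizations have been published yet.
llm = Llama(
    model_path="model-Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=0,  # 0 = run entirely from system RAM on the CPU
)

print(llm("Hello, world!", max_tokens=32)["choices"][0]["text"])
```

Even then, expect CPU-only inference on a model this size to be extremely slow.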