How to run it on a mobile device?

#1
by KoiSikhaDo - opened

A lot of the non-allenai blogs online say that the model is small enough to run on a mobile device, and I just wanted to know whether this can actually be done.
Can the model be quantised with bitsandbytes to make it small enough to run on a mobile device?

Yes, @soldni actually ran https://huggingface.co/allenai/OLMoE-1B-7B-0924-GGUF on a mobile device. I'm not sure whether there is a public guide about it somewhere, though.

We'll also need MolmoE support merged into llama.cpp first to make it work like the model above (i.e. a PR like https://github.com/ggerganov/llama.cpp/pull/9462).
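Once such a PR lands, running the GGUF would presumably mirror the usual llama.cpp flow; a rough sketch (the quantized filename and prompt below are assumptions, not a tested recipe):

```shell
# Sketch of the standard llama.cpp flow; the GGUF filename is an assumption.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release
# Fetch a quantized OLMoE GGUF from the Hub (filename assumed):
huggingface-cli download allenai/OLMoE-1B-7B-0924-GGUF \
    olmoe-1b-7b-0924-q4_k_m.gguf --local-dir .
./build/bin/llama-cli -m olmoe-1b-7b-0924-q4_k_m.gguf -p "Hello from my phone"
```

On an actual phone the same binary is typically cross-compiled (e.g. via the Android NDK or an iOS wrapper app) rather than run directly.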

Ai2 org

Stay tuned, we are trying to get this one running!

@soldni Hello, are there any updates on this? I would like to test this model with vLLM.

Thank you very much for your amazing models!
