RAM requirements for running Llama-3.3-70B-Instruct-Q5_K_M.gguf

#4
by hyadav22 - opened

I have a server with 250 GB of RAM but no GPU. I attempted to run the model Llama-3.3-70B-Instruct-Q5_K_M.gguf, but it failed to load. I’d like to know the memory requirements for running the other Unsloth quantized models, such as:

Llama-3.3-70B-Instruct-Q2_K.gguf
Llama-3.3-70B-Instruct-Q3_K_M.gguf
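As a rough back-of-the-envelope check, the weight memory for a GGUF quant is approximately the parameter count times the bits per weight. The bits-per-weight figures below (~2.6 for Q2_K, ~3.9 for Q3_K_M, ~5.5 for Q5_K_M) and the flat overhead for KV cache and buffers are my assumptions, not official numbers; treat this as a sketch only:

```python
def est_ram_gb(n_params_billion: float, bits_per_weight: float, overhead_gb: float = 2.0) -> float:
    """Rough RAM estimate: weight bytes plus a small flat overhead
    for KV cache and runtime buffers (overhead is an assumption)."""
    weight_gb = n_params_billion * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb + overhead_gb

# Approximate bits-per-weight for llama.cpp k-quants (assumed values)
for name, bpw in [("Q2_K", 2.6), ("Q3_K_M", 3.9), ("Q5_K_M", 5.5)]:
    print(f"{name}: ~{est_ram_gb(70, bpw):.0f} GB")
```

By this estimate, Q5_K_M of a 70B model needs around 50 GB of RAM just for the weights, well under 250 GB, so raw capacity alone should not be the blocker.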

It should definitely work; I'm not sure why it's failing. Did you try the other GGUFs as well to see if they load?

Also, did you enable offloading?
