
Didn't work for me

#2
by segmond - opened

...
llm_load_tensors: ggml ctx size = 3.54 MiB
llama_model_load: error loading model: check_tensor_dims: tensor 'output.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './granite-34b-code-instruct.Q8_0.gguf'
main: error: unable to load model

$ sha256sum granite-34b-code-instruct.Q8_0.gguf
5068ab261d8acb474300597beb3674712f48d15ae807acdacb2ae5ab70742a7c granite-34b-code-instruct.Q8_0.gguf
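Since the checksum rules out a corrupt download, the failure points at the model architecture rather than the file itself. As an extra sanity check before blaming the download, the GGUF header can be inspected directly. The sketch below is a minimal, hedged example assuming the GGUF v2/v3 header layout (little-endian magic `GGUF`, uint32 version, uint64 tensor count, uint64 metadata key/value count); the demo bytes and counts are synthetic, not taken from this model.

```python
import struct

def parse_gguf_header(blob: bytes):
    # GGUF files (v2+) begin with the magic b"GGUF", then a little-endian
    # uint32 version, a uint64 tensor count, and a uint64 metadata KV count.
    magic, version = struct.unpack_from("<4sI", blob, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    n_tensors, n_kv = struct.unpack_from("<QQ", blob, 8)
    return version, n_tensors, n_kv

# Synthetic 24-byte header for demonstration only; a real check would read
# the first 24 bytes of granite-34b-code-instruct.Q8_0.gguf instead.
demo = struct.pack("<4sIQQ", b"GGUF", 3, 723, 24)
print(parse_gguf_header(demo))  # -> (3, 723, 24)
```

If the magic and version parse cleanly, the file is a valid GGUF container and the `output.weight` error comes from llama.cpp not yet knowing the architecture, which matches the linked issue.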

https://github.com/ggerganov/llama.cpp/issues/7116 Still waiting for support in llama.cpp.

YorkieOH10 changed discussion status to closed

Ok, thanks. You should probably note that on the model card, and add a link to the branch with support for anyone feeling adventurous.
