create_tensor: tensor 'blk.0.ffn_gate.weight' not found

#4
opened by Althenwolf

Hi!

I get the following error when loading mixtral-8x7b-v0.1.Q8_0.gguf:

llm_load_tensors: ggml ctx size = 0.32 MB
llm_load_tensors: using CUDA for GPU acceleration
ggml_cuda_set_main_device: using device 0 (NVIDIA GeForce RTX 3090) as main device
error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
llama_load_model_from_file: failed to load model
2023-12-11 16:55:48 ERROR:Failed to load the model.

...

File "/env/lib/python3.10/site-packages/llama_cpp_cuda/llama.py", line 365, in init
assert self.model is not None
AssertionError

Any idea?
PS: TheBloke, many, many thanks for your work and time!

You need to use the llama.cpp branch with Mixtral support: https://github.com/ggerganov/llama.cpp/tree/mixtral
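Roughly, that means checking out that branch before building. A minimal sketch, assuming a plain make build on Linux or macOS:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout mixtral
make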

Is there no Windows 10 binary compiled for this?

Hi, I'm using the right branch (latest pull from mixtral) but still getting the same error:

llm_load_tensors: ggml ctx size = 0.36 MiB
llm_load_tensors: using CUDA for GPU acceleration
error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/mnt/e/mixtral-8x7b-v0.1.Q4_K_M.gguf'
main: error: unable to load model

In the list of tensors printed by llama_model_loader when running main, I don't see this tensor. I only see tensors like blk.0.ffn_gate.0.weight.
Am I missing anything here?
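In case it's useful, the tensor names can also be dumped straight from the file, independently of main. A sketch, assuming your llama.cpp checkout includes gguf-py/scripts/gguf-dump.py (you may also need pip install gguf):

python3 gguf-py/scripts/gguf-dump.py /mnt/e/mixtral-8x7b-v0.1.Q4_K_M.gguf | grep ffn_gate

If the file is intact, that should list the eight per-expert tensors (blk.0.ffn_gate.0.weight through blk.0.ffn_gate.7.weight) rather than a single blk.0.ffn_gate.weight; presumably the latter is the dense-model name that builds without Mixtral support look for, hence the error.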

I had the same issue on an Apple M2 Max (Metal), and it was solved by pulling the mixtral branch instead of the master branch of llama.cpp and then remaking. But you're right that the supposedly missing tensor doesn't appear in the list of created tensors even when loading works!
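In case it helps other Metal users, the remake step was nothing special. A sketch (LLAMA_METAL=1 just enables the Metal backend explicitly):

cd llama.cpp
git checkout mixtral
git pull
make clean
LLAMA_METAL=1 make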

I tried a clean build multiple times but still no luck. Should the mixtral branch work as-is, or are additional changes or patches required? Any help is greatly appreciated.

llm_load_tensors: ggml ctx size = 0.36 MiB
llm_load_tensors: using CUDA for GPU acceleration
error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/mnt/models/mixtral-8x7b-v0.1.Q4_K_M.gguf'
main: error: unable to load model
.../llama.cpp/build$ git status
On branch mixtral
nothing to commit, working tree clean

Final update: I got it working eventually. For some reason building from the branch wasn't working at first, but after a few tries it worked, and I can now load the models correctly.
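In case anyone else gets stuck the same way, my best guess is stale build artifacts. What finally worked was a fully clean rebuild; a sketch, assuming a CMake build with CUDA (LLAMA_CUBLAS was the CUDA option at the time):

cd llama.cpp
git checkout mixtral
git pull
rm -rf build
cmake -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release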

You need to use the llama.cpp branch with Mixtral support: https://github.com/ggerganov/llama.cpp/tree/mixtral

Can you please explain more? How do we do it?

How do I make it use the newly downloaded llama.cpp? (I also think we no longer need the branch for that, since I believe it was merged.) And where do I install it?

@thadoop @Nicoolodion

  • Clone the repo as per normal: git clone https://github.com/ggerganov/llama.cpp
  • Before you build, run git checkout mixtral
  • Build as per normal (a quick smoke test is sketched below)
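Once it builds, something like this should confirm the model loads (the model path is a placeholder for wherever you saved the GGUF; use ./main for a make build or build/bin/main for a CMake build):

./main -m /path/to/mixtral-8x7b-v0.1.Q4_K_M.gguf -p "Hello" -n 64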
