Q3 and Q2 quants broken
Undi's own provided GGUF files seem to work fine, but not these.
I keep getting errors when trying to load them in oobabooga's text-generation-webui. I tried both the llamacpp and llamacpp_HF loaders, and neither works.
Llamacpp loader error:

```
error loading model: create_tensor: tensor 'token_embd.weight' has wrong shape; expected 5120, 32001, got 5120, 32000, 1, 1
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "C:\Users\pasil\text-generation-webui\server.py", line 223, in <module>
    shared.model, shared.tokenizer = load_model(model_name)
  File "C:\Users\pasil\text-generation-webui\modules\models.py", line 79, in load_model
    output = load_func_map[loader](model_name)
  File "C:\Users\pasil\text-generation-webui\modules\models.py", line 225, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_file)
  File "C:\Users\pasil\text-generation-webui\modules\llamacpp_model.py", line 91, in from_pretrained
    result.model = Llama(**params)
  File "C:\Users\pasil\anaconda3\envs\textgen\lib\site-packages\llama_cpp_cuda\llama.py", line 365, in __init__
    assert self.model is not None
AssertionError
Exception ignored in: <function LlamaCppModel.__del__ at 0x000001ABCECE8AF0>
Traceback (most recent call last):
  File "C:\Users\pasil\text-generation-webui\modules\llamacpp_model.py", line 49, in __del__
    self.model.__del__()
AttributeError: 'LlamaCppModel' object has no attribute 'model'
```
Llamacpp_hf loader error:

```
error loading model: create_tensor: tensor 'token_embd.weight' has wrong shape; expected 5120, 32001, got 5120, 32000, 1, 1
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "C:\Users\pasil\text-generation-webui\server.py", line 223, in <module>
    shared.model, shared.tokenizer = load_model(model_name)
  File "C:\Users\pasil\text-generation-webui\modules\models.py", line 79, in load_model
    output = load_func_map[loader](model_name)
  File "C:\Users\pasil\text-generation-webui\modules\models.py", line 250, in llamacpp_HF_loader
    model = LlamacppHF.from_pretrained(model_name)
  File "C:\Users\pasil\text-generation-webui\modules\llamacpp_hf.py", line 211, in from_pretrained
    model = Llama(**params)
  File "C:\Users\pasil\anaconda3\envs\textgen\lib\site-packages\llama_cpp_cuda\llama.py", line 365, in __init__
    assert self.model is not None
AssertionError
```
Actually, although the repo is misconfigured, when I try to make a new GGUF now, the conversion correctly ignores added_tokens.json. So it's possible I made a mistake when making the first quants, like maybe I edited my local files the wrong way. I can't remember now what I did for this model specifically, but I can see I had to make some local edit to it.
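For anyone curious what the error actually means, here's a minimal sketch (with hypothetical numbers and token names, not this repo's actual files) of how an added_tokens.json entry shifts the declared vocab size and trips the shape check llama.cpp runs when loading the embedding tensor:

```python
import json

# Hypothetical contents of added_tokens.json: one extra token appended
# after a base SentencePiece vocab of 32000 entries.
added_tokens_json = '{"<pad>": 32000}'

base_vocab = 32000                        # tokens defined in tokenizer.model
added = json.loads(added_tokens_json)     # extra tokens from added_tokens.json
vocab_declared = base_vocab + len(added)  # 32001: what the metadata ends up declaring

# If the GGUF metadata counts the added token but the checkpoint's embedding
# matrix only has rows for the base vocab, the shapes disagree and loading
# aborts, which matches the error above:
#   tensor 'token_embd.weight' has wrong shape;
#   expected 5120, 32001, got 5120, 32000, 1, 1
n_embd = 5120
expected_shape = (n_embd, vocab_declared)  # (5120, 32001)
actual_shape = (n_embd, base_vocab)        # (5120, 32000)
print(expected_shape == actual_shape)      # False -> llama.cpp refuses to load
```

That would explain why a fresh conversion that ignores added_tokens.json produces a working file: both the metadata and the tensor then agree on 32000.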
Anyway, I've got them working now and the new upload will start in a moment.
New quants are uploaded and working fine.
Q3_K_M:

```
system_info: n_threads = 15 / 30 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 4096, n_batch = 512, n_predict = -1, n_keep = 0

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
How much wood would a woodchuck chuck if a woodchuck could chuck wood?

### Response:
A woodchuck, also known as a groundhog, is a herbivore and does not typically "chuck" (throw) wood. They are more likely to burrow into the ground or move leaves and debris with their strong front legs. However, if we were to assume that a woodchuck could chuck wood like an axe, it's impossible to determine how much wood they would be able to throw due to lack of information about their physical strength or any context regarding the size and type of wood being referred to. [end of text]
```
Yeah, thanks, now it's working for me as well.