Can't host with vLLM - "LlamaModel" architecture is not supported.

#4
by huggingfacemotnt - opened

This model was fine-tuned from codellama/CodeLlama-70b-hf, which has this as the architecture in its config.json:

"LlamaForCausalLM"

The config.json for this model has this as the architecture:

"LlamaModel"

This causes an error when hosting with vLLM, which supports LlamaForCausalLM but not LlamaModel. I tried changing the config.json for sqlcoder-70b-alpha to "LlamaForCausalLM", but that produced a KeyError:

KeyError: 'layers.11.input_layernorm.weight'
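For what it's worth, the mismatch is easy to confirm from the configs alone (a rough sketch using transformers; assumes you have access to both repos on the Hub):

```python
from transformers import AutoConfig

# The "architectures" field in config.json is what vLLM uses to pick a model class
base = AutoConfig.from_pretrained("codellama/CodeLlama-70b-hf")
ft = AutoConfig.from_pretrained("defog/sqlcoder-70b-alpha")

print(base.architectures)  # ["LlamaForCausalLM"]
print(ft.architectures)    # ["LlamaModel"]
```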

Is this a bug, or is the different architecture intentional? If it's not a bug, it seems vLLM cannot host this model, even though it supports LlamaForCausalLM from CodeLlama. Is there a recommended way to host this model, e.g. through vLLM, TGI, or something similar?
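For reference, this is roughly the setup I'd like to run once the architecture question is sorted out (a minimal sketch of vLLM's offline Python API; tensor_parallel_size is just an illustrative value and depends on your GPUs):

```python
from vllm import LLM, SamplingParams

# With the current config.json this fails at load time, because vLLM
# looks up "LlamaModel" in its architecture registry and doesn't find it.
llm = LLM(
    model="defog/sqlcoder-70b-alpha",
    tensor_parallel_size=4,  # illustrative; a 70B model needs several GPUs
)

params = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate(["-- List all customers who signed up in 2023"], params)
print(outputs[0].outputs[0].text)
```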

Defog.ai org

Hi there, we discovered a bizarre bug where the model's lm_head.weight was not uploaded to HF in the upload process. This is causing many integrations to break, and the model uploaded here is producing gibberish results.

Fix coming soon – hopefully in the next hour
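If anyone wants to check whether the copy they're looking at is affected, the missing tensor shows up in the checkpoint index (a quick sketch; assumes the standard sharded safetensors index filename):

```python
import json
from huggingface_hub import hf_hub_download

# The index lists every tensor in the sharded checkpoint
index_path = hf_hub_download("defog/sqlcoder-70b-alpha", "model.safetensors.index.json")
with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# False on the broken upload, True once the fixed weights are up
print("lm_head.weight" in weight_map)
```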

Defog.ai org
edited Jan 31

Fixed with a reupload of the model weights! Apologies for the issue. You'll unfortunately have to clear the cached copy and re-download the model weights (run rm -rf ~/.cache/huggingface/hub/models--defog--sqlcoder-70b-alpha). Should work great after that
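If you prefer doing the re-download from Python, a plain snapshot_download should pull the fixed weights back into the default cache location (sketch only):

```python
from huggingface_hub import snapshot_download

# Re-fetch the repaired weights after removing the stale cached copy
snapshot_download("defog/sqlcoder-70b-alpha")
```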

rishdotblog changed discussion status to closed
