vllm (installed from main branch) doesn't like this model

by Gershwin69
ValueError: torch.bfloat16 is not supported for quantization method awq. Supported dtypes: [torch.float16]
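
For reference, this is roughly the call that triggers it (the model path is a placeholder for this repo; vLLM reads the dtype from the model's config.json):

```python
from vllm import LLM

# Loading an AWQ-quantized model whose config.json declares
# "torch_dtype": "bfloat16". With dtype left at "auto", vLLM picks up
# bfloat16 from the config, and the AWQ path rejects it with the
# ValueError quoted above.
llm = LLM(model="path/to/this-awq-model", quantization="awq")
```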

Ah OK. This likely needs to be reported to the vLLM team.

Can you try editing the config.json locally to change torch_dtype to float16 and see if it loads OK then?
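
Something like this should do it (the path is a placeholder for wherever you downloaded the model):

```python
import json

# Local path to the downloaded model directory (placeholder).
config_path = "path/to/this-awq-model/config.json"

with open(config_path) as f:
    config = json.load(f)

# Override the declared dtype so vLLM loads the AWQ weights in float16.
config["torch_dtype"] = "float16"

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```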

That worked, and I've mentioned it on their Discord server.
