Upgrade format of this model?
Hello Andrei, I work for NeuralMagic and I'm adding AQLM support to vLLM in an upcoming PR. Your Llama 2 7B 1x16 and 2x8 models have no custom code and a quantization_config block in the config.json, which is perfect. I'm able to run those models (and a tiny Llama 2 you have as well) end to end with no problems.
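For anyone following along, the newer layout looks roughly like this (abridged sketch; the field names follow the transformers AQLM integration, but the values shown are illustrative for a 1x16 variant rather than copied from the actual checkpoint):

```json
{
  "model_type": "llama",
  "quantization_config": {
    "quant_method": "aqlm",
    "num_codebooks": 1,
    "nbits_per_codebook": 16,
    "in_group_size": 8,
    "out_group_size": 1
  }
}
```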
But this model, and the rest referenced in the README, have what looks like an older format, with a custom aqlm block in the config.json and custom code, making them unreadable by vLLM. I was wondering: do you have plans to update those to the same standard as the first two? Or is that something I could try to do with a PR, if it's just a question of changing the config.json and removing the custom code? A rough sketch of the conversion I have in mind is below.
Thanks, -James
Indeed, I missed this model when updating checkpoints.
I've updated the format.
Thanks!
Thank you!