Can I apply a LoRA?

by RonanMcGovern - opened

This is cool. I'm wondering, though, whether I can apply a LoRA adapter to the model, say with vLLM? Thanks

Neural Magic org

We are working on making the compressed-tensors models (2:4 sparsity and beyond) compatible with training LoRAs through HF PEFT. Once this is done, you can deploy the resulting LoRA adapters with vLLM. Integrating with PEFT is still a work in progress; our integration with HFQuantizer covers most of the work, and we just need the last mile to iron out the user stories. We do not have the bandwidth to work on this for the next month or so, but we would love to collaborate and can provide guidance if this is something you wanted to work on.
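
For reference, once that integration lands, the flow would presumably look roughly like the standard PEFT workflow. A minimal sketch (the model id is a placeholder, not a released checkpoint, and loading the sparse base model this way depends on the HFQuantizer work described above):

```python
# Hypothetical training-side sketch, assuming PEFT support for
# compressed-tensors models lands as described above.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder model id -- substitute the actual sparse checkpoint.
base = AutoModelForCausalLM.from_pretrained("neuralmagic/<sparse-model>")

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights train
# ...train as usual, then save the adapter:
# model.save_pretrained("./lora_adapter")
```

Deployment of the resulting adapter would then go through vLLM's existing LoRA support (again a sketch; the model id and adapter path are placeholders):

```python
# Serving a saved LoRA adapter with vLLM's multi-LoRA support.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# enable_lora tells vLLM to reserve capacity for adapter weights.
llm = LLM(model="neuralmagic/<sparse-model>", enable_lora=True)
params = SamplingParams(temperature=0.7, max_tokens=128)

# LoRARequest(adapter name, integer id, local path to the adapter)
outputs = llm.generate(
    ["What does 2:4 sparsity mean?"],
    params,
    lora_request=LoRARequest("my_adapter", 1, "./lora_adapter"),
)
print(outputs[0].outputs[0].text)
```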

That's very cool. Thanks

RonanMcGovern changed discussion status to closed
