4-bit quant?

#3 opened by Neman

Hi! Thank you for releasing this multimodal model. First tests are impressive. Even the 1.3B version is good for its size.
It's just that the 7B version in full precision is still taxing on the personal hardware we have at home.
Would it be possible to quantize it to int4, like Qwen did with their Qwen-VL-Chat-Int4?
I think it would be best if you could do it and publish it here in your repo so the community can use it.
If not, maybe you could give us some guidelines on how to do it.
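For reference, here is roughly what I had in mind: a minimal sketch using the bitsandbytes 4-bit integration in transformers, rather than the GPTQ route Qwen took for their Int4 checkpoint. The repo id and the assumption that DeepSeek-VL's custom modeling code plays nicely with these quantization hooks are untested guesses on my part:

```python
# Sketch: load the 7B chat model with weights quantized to 4-bit on the fly.
# Assumes "deepseek-ai/deepseek-vl-7b-chat" is the repo id and that the
# model's custom code is compatible with transformers' bnb integration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize linear layers to 4-bit at load time
    bnb_4bit_quant_type="nf4",              # NormalFloat4 usually beats plain int4
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in 16-bit
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-vl-7b-chat",
    quantization_config=bnb_config,
    trust_remote_code=True,  # DeepSeek-VL ships custom modeling code
    device_map="auto",
)
```

The difference from Qwen's approach: this quantizes at load time, while GPTQ needs a calibration pass but produces a redistributable int4 checkpoint like Qwen-VL-Chat-Int4.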

DeepSeek org

@Neman
We don't have plans to work on quantizing the model; you might want to wait for other community members to tackle it.

Thank you for the answer.

Neman changed discussion status to closed
