Upload mmproj-Qwen2-VL-2B-Instruct-f16.gguf

#1
by stduhpf - opened

More options for mmproj quantization would be better, I think. The F32 mmproj is still fairly large, and from my limited testing, F16 seems to perform perfectly fine.
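A rough back-of-the-envelope check of why F16 tends to be fine here: IEEE 754 half precision keeps about 11 bits of mantissa, so round-tripping weights at a typical scale through F16 perturbs each value by only a fraction of a percent. This is a stdlib-only sketch on synthetic random values, not the actual mmproj tensors; the weight scale of 0.02 is an assumption for illustration.

```python
import random
import struct

def to_f16(x: float) -> float:
    """Round-trip a float through IEEE 754 half precision (binary16)."""
    return struct.unpack('e', struct.pack('e', x))[0]

# Synthetic stand-in for projector weights (NOT the real mmproj tensors):
# Gaussian values at an assumed typical weight scale of 0.02.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(100_000)]

# Relative error introduced by storing each value in F16 instead of F32/F64.
errs = [abs(w - to_f16(w)) / (abs(w) + 1e-12) for w in weights]
print(f"mean relative error from F16 rounding: {sum(errs) / len(errs):.2e}")
```

The mean relative error lands around a few parts in ten thousand, which is typically far below the noise floor of the vision projector's activations, consistent with F16 and F32 mmproj files giving effectively identical answers.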

Yeah, I wasn't sure about this one, simply because the qwen2vl code defaults to F32, so I thought maybe there was something important about it.

It's better to allow more options. From what I've tested, F16 works just fine as well. I used the same prompt and image to test both, and the outputs came out identical (maybe very minor differences, but no incorrect answers). Cheers!

Yeah, thanks for confirming! I uploaded my own just so I can guarantee its origin, but I appreciate the help :)

bartowski changed pull request status to closed
