Inference:

```shell
./llama-qwen2vl-cli -m Q8_0.gguf --mmproj qwen2vl-vision.gguf -p "Describe this image." --image "demo.jpg"
```
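A slightly fuller invocation may be useful; this is a sketch that assumes standard llama.cpp options (`-ngl`, `--temp`) are available in the build, and the GGUF filenames shown are placeholders for the files you actually download from this repo:

```shell
# Filenames below are illustrative; substitute the GGUF files from this repo.
# -ngl offloads layers to the GPU (only with a GPU-enabled llama.cpp build);
# --temp lowers sampling temperature for more deterministic descriptions.
./llama-qwen2vl-cli -m Qwen2-VL-2B-Instruct-Q8_0.gguf \
    --mmproj qwen2vl-vision.gguf \
    -ngl 99 --temp 0.1 \
    -p "Describe this image." --image "demo.jpg"
```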

Converted using this Colab Notebook.

Special thanks to:

HimariO for the excellent work on enabling quantization for Qwen2-VL — PR on GitHub.

Model details:

- Format: GGUF
- Model size: 1.54B params
- Architecture: qwen2vl
- Available quantizations: 4-bit, 8-bit, 16-bit


Model tree for Lyte/Qwen2-VL-2B-Instruct-GGUF:

- Base model: Qwen/Qwen2-VL-2B (this repo is a quantized version of it)