stduhpf committed
Commit e568caa · verified · 1 Parent(s): a241d70

Upload mmproj-Qwen2-VL-2B-Instruct-f16.gguf


More options for mmproj quantization would be better, I think. The f32 mmproj is still fairly large, and from my limited testing, f16 seems to perform perfectly fine.

.gitattributes CHANGED
@@ -57,3 +57,4 @@ Qwen2-VL-2B-Instruct-IQ2_M.gguf filter=lfs diff=lfs merge=lfs -text
 Qwen2-VL-2B-Instruct-f16.gguf filter=lfs diff=lfs merge=lfs -text
 Qwen2-VL-2B-Instruct.imatrix filter=lfs diff=lfs merge=lfs -text
 mmproj-Qwen2-VL-2B-Instruct-f32.gguf filter=lfs diff=lfs merge=lfs -text
+mmproj-Qwen2-VL-2B-Instruct-f16.gguf filter=lfs diff=lfs merge=lfs -text
mmproj-Qwen2-VL-2B-Instruct-f16.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9b160fd199d84c17b25093c47ba0b4119f46f030d23f63e6716e91a88c9f3cd6
+size 1331656192
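The added file is stored via Git LFS, so the repo only carries the three-line pointer above; the ~1.3 GB GGUF blob lives on the LFS server. A minimal sketch of parsing such a pointer (the `parse_lfs_pointer` helper is hypothetical, not part of this repository or of Git LFS itself):

```python
# Hypothetical helper: parse the "key value" lines of a Git LFS pointer
# file (format: https://git-lfs.github.com/spec/v1) into a dict.

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9b160fd199d84c17b25093c47ba0b4119f46f030d23f63e6716e91a88c9f3cd6
size 1331656192
"""

def parse_lfs_pointer(text: str) -> dict:
    """Each non-empty line is 'key value'; split on the first space."""
    fields = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(" ")
            fields[key] = value
    return fields

info = parse_lfs_pointer(POINTER)
algo, _, digest = info["oid"].partition(":")
print(algo, info["size"])  # hash algorithm and blob size in bytes
```

Checking `sha256sum` of the downloaded GGUF against the `oid` digest is a quick way to verify the ~1.3 GB file transferred intact.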