Nexesenex committed
Commit 368164c
1 Parent(s): ff0ccc2

Update README.md

Files changed (1):
  1. README.md +18 -3
README.md CHANGED
@@ -18,11 +18,26 @@ That make me wonder about the future, when we'll get Miqu 70b models properly fi
 
 Available quants :
 
- Q8_0, Q5_K_S, Q4_K_M, Q4_K_S, Q3_K_M, Q2_K
+ Full offload possible on 48GB VRAM with a huge context size :
 
- To come in the week :
+ Q8_0
 
- IQ3_XXS, Q2_K_S, IQ2_XS, IQ2_XXS
+ Full offload possible on 36 GB VRAM with a huge context size :
+
+ Q5_K_S
+
+ Full offload possible on 24GB VRAM with a big to huge context size (from 12288 with Q4_K_M, for example)
+
+ Q4_K_M, Q4_K_S, Q3_K_M
+
+ Full offload possible on 16GB VRAM with a decent context size
+
+ IQ3_XXS SOTA otw (which is equivalent to a Q3_K_S with more context!), Q2_K, Q2_K_S otw
+
+ Full offload possible on 12GB VRAM with a decent context size.
+
+ IQ2_XS SOTA otw
+ Lower quality : IQ2_XXS SOTA otw
 
 ---
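The VRAM tiers in the diff above follow directly from how many bits each quant type spends per weight. A minimal sketch of that arithmetic (the bits-per-weight table and the 34B parameter count below are illustrative assumptions, not figures taken from this repository):

```python
# Back-of-the-envelope sizing: a GGUF's weight footprint is roughly
# parameter_count * bits_per_weight / 8 bytes, ignoring the KV cache and
# runtime overhead (which the chosen context size adds on top).

# Approximate effective bits per weight -- assumed values, for illustration only.
BITS_PER_WEIGHT = {
    "Q8_0": 8.5,
    "Q5_K_S": 5.5,
    "Q4_K_M": 4.85,
    "Q4_K_S": 4.6,
    "Q3_K_M": 3.9,
    "Q2_K": 2.6,
    "IQ2_XS": 2.3,
}

def weight_gib(n_params: float, quant: str) -> float:
    """Approximate weight size in GiB for a given quant type."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 2**30

# Hypothetical 34B-parameter model, purely for illustration:
for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{weight_gib(34e9, quant):.1f} GiB of weights")
```

Whatever headroom remains after the weights goes to the KV cache, which is why the lower-bit quants on the same card leave room for a larger context.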