Update README.md
README.md
CHANGED
@@ -19,7 +19,7 @@ Full offload possible on 48GB VRAM with a huge context size :
 
 Full offload possible on 36GB VRAM with a variable context size (up to 7168 with Q3_K_M, for example)
 - Q3_K_M, Q3_K_S, Q3_K_XS, IQ3_XXS SOTA (which is equivalent to a Q3_K_S with more context!)
-- Lower quality : Q2_K_S
+- Lower quality : Q2_K (I remade one with iMatrix, which beats hands-down Miqudev's on perplexity), Q2_K_S
 
 Full offload possible on 24GB VRAM with a decent context size.
 - IQ2_XS SOTA
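The VRAM tiers in the changed lines follow from simple bits-per-weight arithmetic: weights-only file size ≈ parameter count × bpw / 8, plus headroom for the KV cache (which is why a tighter quant buys a larger context on the same card). A minimal sketch, assuming a 70B model and approximate llama.cpp bpw figures (the numbers below are rough community-cited values, not taken from this README):

```python
# Rough GGUF size estimate: params (billions) * bits-per-weight / 8 -> GB.
# bpw values are approximate llama.cpp figures (an assumption, not from this repo).
BPW = {
    "Q3_K_M": 3.91,
    "IQ3_XXS": 3.06,
    "IQ2_XS": 2.31,
}

def est_size_gb(params_b: float, quant: str) -> float:
    """Estimated weights-only file size in GB (excludes KV cache and overhead)."""
    return params_b * BPW[quant] / 8

# A 70B model at Q3_K_M is ~34 GB of weights, so full offload on 36GB VRAM
# leaves only a few GB for the KV cache -- hence the capped 7168 context.
# IQ2_XS lands around 20 GB, fitting a 24GB card with room for context.
for quant in BPW:
    print(f"{quant}: ~{est_size_gb(70, quant):.1f} GB")
```

The same arithmetic explains the "IQ3_XXS is a Q3_K_S with more context" remark: a lower bpw at comparable perplexity frees VRAM that the KV cache can then use.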