Update README.md
--- a/README.md
+++ b/README.md
@@ -18,11 +18,26 @@ That make me wonder about the future, when we'll get Miqu 70b models properly fi
 
 Available quants :
 
-
+Full offload possible on 48GB VRAM with a huge context size :
 
-
+Q8_0
 
-
+Full offload possible on 36 GB VRAM with a huge context size :
+
+Q5_K_S
+
+Full offload possible on 24GB VRAM with a big to huge context size (from 12288 with Q4_K_M, for example)
+
+Q4_K_M, Q4_K_S, Q3_K_M
+
+Full offload possible on 16GB VRAM with a decent context size
+
+IQ3_XXS SOTA otw (which is equivalent to a Q3_K_S with more context!), Q2_K, Q2_K_S otw
+
+Full offload possible on 12GB VRAM with a decent context size.
+
+IQ2_XS SOTA otw
+Lower quality : IQ2_XXS SOTA otw
 
 ---
 
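The VRAM tiers added in this change come down to simple arithmetic: quantized weights plus KV cache plus some runtime overhead must fit on the card. The sketch below is one rough way to check that, not the author's method. The README does not state the model's parameter count, layer count, or attention layout, so every model value passed in is a hypothetical placeholder, and the `overhead_gib` default is a guess. Only the Q8_0 figure is derived, from its block layout of 32 int8 weights plus one fp16 scale; for other quants you would need to supply your own bits-per-weight value.

```python
# Rough VRAM estimate for fully offloading a quantized model.
# All model parameters are supplied by the caller; none come from the README.

# Q8_0 stores 32 int8 weights plus one fp16 scale per block:
# (32 * 8 + 16) bits / 32 weights = 8.5 bits per weight.
Q8_0_BITS_PER_WEIGHT = (32 * 8 + 16) / 32

GIB = 2 ** 30


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 n_ctx: int, bytes_per_elem: int = 2) -> float:
    """Size of the KV cache in GiB (keys and values, hence the factor 2)."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem / GIB


def fits(vram_gib: float, n_params: float, bits_per_weight: float,
         n_layers: int, n_kv_heads: int, head_dim: int, n_ctx: int,
         bytes_per_elem: int = 2, overhead_gib: float = 1.0) -> bool:
    """True if weights + KV cache + assumed overhead fit in vram_gib."""
    need = (weights_gib(n_params, bits_per_weight)
            + kv_cache_gib(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem)
            + overhead_gib)
    return need <= vram_gib
```

Real usage also depends on compute buffers and on the inference runtime, so treat the result as a first-pass filter when choosing between, say, Q4_K_M and Q4_K_S for a 24GB card, and confirm by loading the model.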