Nexesenex committed
Commit 368164c
1 Parent(s): ff0ccc2

Update README.md

Files changed (1):
  1. README.md +18 -3
README.md CHANGED
@@ -18,11 +18,26 @@ That make me wonder about the future, when we'll get Miqu 70b models properly fi
 
 Available quants :
 
- Q8_0, Q5_K_S, Q4_K_M, Q4_K_S, Q3_K_M, Q2_K
+ Full offload possible on 48GB VRAM with a huge context size :
 
- To come in the week :
+ Q8_0
 
- IQ3_XXS, Q2_K_S, IQ2_XS, IQ2_XXS
+ Full offload possible on 36 GB VRAM with a huge context size :
+
+ Q5_K_S
+
+ Full offload possible on 24GB VRAM with a big to huge context size (from 12288 with Q4_K_M, for example)
+
+ Q4_K_M, Q4_K_S, Q3_K_M
+
+ Full offload possible on 16GB VRAM with a decent context size
+
+ IQ3_XXS SOTA otw (which is equivalent to a Q3_K_S with more context!), Q2_K, Q2_K_S otw
+
+ Full offload possible on 12GB VRAM with a decent context size.
+
+ IQ2_XS SOTA otw
+ Lower quality : IQ2_XXS SOTA otw
 
 ---
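The VRAM tiers in the diff above follow directly from how many bits each quant type spends per weight. A minimal sketch of that arithmetic (the bits-per-weight table and the 34B parameter count below are illustrative assumptions, not figures taken from this repository):

```python
# Back-of-the-envelope sizing: a GGUF's weight footprint is roughly
# parameter_count * bits_per_weight / 8 bytes, ignoring the KV cache and
# runtime overhead (which the chosen context size adds on top).

# Approximate effective bits per weight -- assumed values, for illustration only.
BITS_PER_WEIGHT = {
    "Q8_0": 8.5,
    "Q5_K_S": 5.5,
    "Q4_K_M": 4.85,
    "Q4_K_S": 4.6,
    "Q3_K_M": 3.9,
    "Q2_K": 2.6,
    "IQ2_XS": 2.3,
}

def weight_gib(n_params: float, quant: str) -> float:
    """Approximate weight size in GiB for a given quant type."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 2**30

# Hypothetical 34B-parameter model, purely for illustration:
for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{weight_gib(34e9, quant):.1f} GiB of weights")
```

Whatever headroom remains after the weights goes to the KV cache, which is why the lower-bit quants on the same card leave room for a larger context.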