devingulliver committed
Commit 29d859a
1 Parent(s): 66ff57d

Update README.md

Files changed (1): README.md +31 -0
README.md CHANGED
@@ -1,3 +1,34 @@
 ---
 license: apache-2.0
+base_model:
+- state-spaces/mamba-130m
+- state-spaces/mamba-370m
+- state-spaces/mamba-790m
+- state-spaces/mamba-1.4b
+- state-spaces/mamba-2.8b
 ---
+
+# Mamba GGUF
+
+These are the Mamba base models, converted to GGUF for use with [llama.cpp](https://github.com/ggerganov/llama.cpp), in a variety of precisions (2, 3, 4, 5, 6, 8, 16, and 32-bit).
+
+Please click "Files and versions" at the top of the page to choose your desired model size, and then click the "📦 LFS ↓" button next to your desired quantization.
+
+Here is a table adapted from [TheBloke](https://huggingface.co/TheBloke) explaining the various precisions:
+
+| Quant method | Use case |
+| ---- | ---- |
+| Q2_K | significant quality loss - not recommended for most purposes |
+| Q3_K_S | very small, high quality loss |
+| Q3_K_M | very small, high quality loss |
+| Q3_K_L | small, substantial quality loss |
+| Q4_0 | legacy; small, very high quality loss - prefer using Q3_K_M |
+| Q4_K_S | small, greater quality loss |
+| Q4_K_M | medium, balanced quality - recommended |
+| Q5_0 | legacy; medium, balanced quality - prefer using Q4_K_M |
+| Q5_K_S | large, low quality loss - recommended |
+| Q5_K_M | large, very low quality loss - recommended |
+| Q6_K | very large, extremely low quality loss |
+| Q8_0 | very large, extremely low quality loss - not recommended |
+| F16 | half precision - almost identical to the original |
+| F32 | original precision - recommended by the Mamba authors |