Llamacpp quants
Files changed:

- .gitattributes +16 -0
- README.md +48 -0
- gemma-1.1-2b-it-IQ3_M.gguf +3 -0
- gemma-1.1-2b-it-IQ3_S.gguf +3 -0
- gemma-1.1-2b-it-IQ4_NL.gguf +3 -0
- gemma-1.1-2b-it-IQ4_XS.gguf +3 -0
- gemma-1.1-2b-it-Q2_K.gguf +3 -0
- gemma-1.1-2b-it-Q3_K_L.gguf +3 -0
- gemma-1.1-2b-it-Q3_K_M.gguf +3 -0
- gemma-1.1-2b-it-Q3_K_S.gguf +3 -0
- gemma-1.1-2b-it-Q4_0.gguf +3 -0
- gemma-1.1-2b-it-Q4_K_M.gguf +3 -0
- gemma-1.1-2b-it-Q4_K_S.gguf +3 -0
- gemma-1.1-2b-it-Q5_0.gguf +3 -0
- gemma-1.1-2b-it-Q5_K_M.gguf +3 -0
- gemma-1.1-2b-it-Q5_K_S.gguf +3 -0
- gemma-1.1-2b-it-Q6_K.gguf +3 -0
- gemma-1.1-2b-it-Q8_0.gguf +3 -0
.gitattributes CHANGED
@@ -33,3 +33,19 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-IQ3_S.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-IQ4_NL.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-1.1-2b-it-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,48 @@
+---
+library_name: transformers
+widget:
+- messages:
+  - role: user
+    content: How does the brain work?
+inference:
+  parameters:
+    max_new_tokens: 200
+extra_gated_heading: Access Gemma on Hugging Face
+extra_gated_prompt: >-
+  To access Gemma on Hugging Face, you’re required to review and agree to
+  Google’s usage license. To do this, please ensure you’re logged in to Hugging
+  Face and click below. Requests are processed immediately.
+extra_gated_button_content: Acknowledge license
+license: gemma
+quantized_by: bartowski
+pipeline_tag: text-generation
+---
+
+## Llamacpp Quantizations of gemma-1.1-2b-it
+
+Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b2589">b2589</a> for quantization.
+
+Original model: https://huggingface.co/google/gemma-1.1-2b-it
+
+Download a single file (not the whole branch) from below:
+
+| Filename | Quant type | File Size | Description |
+| -------- | ---------- | --------- | ----------- |
+| [gemma-1.1-2b-it-Q8_0.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q8_0.gguf) | Q8_0 | 2.66GB | Extremely high quality; generally unneeded, but the largest quant available. |
+| [gemma-1.1-2b-it-Q6_K.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q6_K.gguf) | Q6_K | 2.06GB | Very high quality, near perfect, *recommended*. |
+| [gemma-1.1-2b-it-Q5_K_M.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q5_K_M.gguf) | Q5_K_M | 1.83GB | High quality, *recommended*. |
+| [gemma-1.1-2b-it-Q5_K_S.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q5_K_S.gguf) | Q5_K_S | 1.79GB | High quality, *recommended*. |
+| [gemma-1.1-2b-it-Q5_0.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q5_0.gguf) | Q5_0 | 1.79GB | High quality, older format, generally not recommended. |
+| [gemma-1.1-2b-it-Q4_K_M.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q4_K_M.gguf) | Q4_K_M | 1.63GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
+| [gemma-1.1-2b-it-Q4_K_S.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q4_K_S.gguf) | Q4_K_S | 1.55GB | Slightly lower quality with small space savings. |
+| [gemma-1.1-2b-it-IQ4_NL.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-IQ4_NL.gguf) | IQ4_NL | 1.56GB | Decent quality, similar to Q4_K_S; a newer quantization method, *recommended*. |
+| [gemma-1.1-2b-it-IQ4_XS.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-IQ4_XS.gguf) | IQ4_XS | 1.50GB | Decent quality; a newer method with performance similar to Q4. |
+| [gemma-1.1-2b-it-Q4_0.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q4_0.gguf) | Q4_0 | 1.55GB | Decent quality, older format, generally not recommended. |
+| [gemma-1.1-2b-it-Q3_K_L.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q3_K_L.gguf) | Q3_K_L | 1.46GB | Lower quality but usable; good when RAM is limited. |
+| [gemma-1.1-2b-it-Q3_K_M.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q3_K_M.gguf) | Q3_K_M | 1.38GB | Even lower quality. |
+| [gemma-1.1-2b-it-IQ3_M.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-IQ3_M.gguf) | IQ3_M | 1.30GB | Medium-low quality; a newer method with decent performance. |
+| [gemma-1.1-2b-it-IQ3_S.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-IQ3_S.gguf) | IQ3_S | 1.28GB | Lower quality; a newer method with decent performance, recommended over the Q3_K quants. |
+| [gemma-1.1-2b-it-Q3_K_S.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q3_K_S.gguf) | Q3_K_S | 1.28GB | Low quality, not recommended. |
+| [gemma-1.1-2b-it-Q2_K.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q2_K.gguf) | Q2_K | 1.15GB | Extremely low quality, *not* recommended. |
+
+Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
gemma-1.1-2b-it-IQ3_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:00a5fbfc4e159681da16464f6faf41b73d1d4589d469d5515289a82ae4d408aa
+size 1308174048
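Each `.gguf` entry in this commit is not the model weights themselves but a Git LFS pointer file in the three-line `version` / `oid` / `size` format shown above. A minimal sketch of parsing one, using the IQ3_M pointer text from this commit:

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split a Git LFS pointer file into its space-separated key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer content copied verbatim from the IQ3_M entry above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:00a5fbfc4e159681da16464f6faf41b73d1d4589d469d5515289a82ae4d408aa
size 1308174048
"""

info = parse_lfs_pointer(pointer)
# info["oid"] is the SHA-256 of the real file; info["size"] is its byte count,
# so the 1.30GB shown in the README table is 1308174048 bytes here.
```

Cloning the repo without LFS fetches only these small pointers; the actual GGUF files come from LFS storage (or a direct file download).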
gemma-1.1-2b-it-IQ3_S.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8d3de552d7fb8d864ef2aed4eaadbc49d1213b16490cfc0cc06b4ae9e5296306
+size 1289234144
gemma-1.1-2b-it-IQ4_NL.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9b86787843372c84d80be69b526e1297bfd3927d3f3caf545d37827002c45542
+size 1560757984
gemma-1.1-2b-it-IQ4_XS.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c57b8d9ae9a42389c2c272dcd25f5e503804e07f663448e2c793663992f4239a
+size 1501218528
gemma-1.1-2b-it-Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fd565dc7ce9f11dc840895c02506dcf5a7f7f595d7837d5eb7816b9fdfa7679d
+size 1157924576
gemma-1.1-2b-it-Q3_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3ee84ba9d80c8fe63ecbb92882dbfe41b6ae65950c63e410962f93a6a9f2570a
+size 1465591520
gemma-1.1-2b-it-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d8aaa0dab9fae50249b5bed1ee880669fc0199d551ae757e47e9a52a49810e6d
+size 1383802592
gemma-1.1-2b-it-Q3_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cf52d95e1f23e8a1f29a85a4bc86accc6ad834e03df7eb2e5a432355db60b1dd
+size 1287980768
gemma-1.1-2b-it-Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ec0c6be4f9ccd657518e9304c1912c872939884748f0927a23b0d542aa4043c
+size 1551189728
gemma-1.1-2b-it-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cc2118e1d780fa33582738d8c99223d62c8734b06ef65076c01618d484d081d4
+size 1630263008
gemma-1.1-2b-it-Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:381472785f7f3a4a72c1f76b3351d9eb2686836b8587d389e7d2afb82f6f48f4
+size 1559840480
gemma-1.1-2b-it-Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:74b19076a50159b0af56ce92270ce53430d2ec5e00e1cddc1fab3c88683f09f7
+size 1798915808
gemma-1.1-2b-it-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9c19111998d075a9e9f2241c0ebfdd331be0e74e68633637d8a89af832fb3b4e
+size 1839650528
gemma-1.1-2b-it-Q5_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b2eefbefa4ee3567eb7cbe891caa6550bae83c6c1d6f1a8c4cfe8aef42f09c52
+size 1798915808
gemma-1.1-2b-it-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6c1e783072c32fdd56a60eb3dff33dac0126e8657b9d7bd558fad7dbeceefad4
+size 2062124768
gemma-1.1-2b-it-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:769d716580c94c874864da0991e54a53d27ead0760b419e38a2d56bcfc4d4f8d
+size 2669070048