bartowski committed
Commit 2478375
1 Parent(s): 08c1171

Llamacpp quants
.gitattributes CHANGED
@@ -56,3 +56,5 @@ gemma-2-9b-it-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
 gemma-2-9b-it-Q8_0_L.gguf filter=lfs diff=lfs merge=lfs -text
 gemma-2-9b-it-f32.gguf filter=lfs diff=lfs merge=lfs -text
 gemma-2-9b-it.imatrix filter=lfs diff=lfs merge=lfs -text
+gemma-2-9b-it-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-2-9b-it-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text
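The two entries added above register the new Q2_K_L and Q3_K_XL quants with Git LFS, so the repository itself only stores small pointer files rather than the multi-gigabyte blobs. A minimal sketch for fetching one of the new files on its own, assuming `huggingface_hub` is installed; the repo id is taken from the README links in this commit:

```python
# Hedged sketch: download a single newly added quant without cloning the LFS repo.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="bartowski/gemma-2-9b-it-GGUF",
    filename="gemma-2-9b-it-Q2_K_L.gguf",  # or "gemma-2-9b-it-Q3_K_XL.gguf"
)
print(path)  # local cache path of the downloaded GGUF file
```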
README.md CHANGED
@@ -27,6 +27,8 @@ All quants made using imatrix option with dataset from [here](https://gist.githu
 <bos><start_of_turn>user
 {prompt}<end_of_turn>
 <start_of_turn>model
+<end_of_turn>
+<start_of_turn>model
 
 ```
 
@@ -40,13 +42,14 @@ Note that this model does not support a System prompt.
 | [gemma-2-9b-it-Q8_0.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q8_0.gguf) | Q8_0 | 9.82GB | Extremely high quality, generally unneeded but max available quant. |
 | [gemma-2-9b-it-Q6_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q6_K_L.gguf) | Q6_K_L | 8.67GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Very high quality, near perfect, *recommended*. |
 | [gemma-2-9b-it-Q6_K.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q6_K.gguf) | Q6_K | 7.58GB | Very high quality, near perfect, *recommended*. |
-| [gemma-2-9b-it-Q5_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q5_K_L.gguf) | Q5_K_L | 7.73GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. High quality, *recommended*. |
+| [gemma-2-9b-it-Q5_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q5_K_L.gguf) | Q5_K_L | 7.72GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. High quality, *recommended*. |
 | [gemma-2-9b-it-Q5_K_M.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q5_K_M.gguf) | Q5_K_M | 6.64GB | High quality, *recommended*. |
 | [gemma-2-9b-it-Q5_K_S.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q5_K_S.gguf) | Q5_K_S | 6.48GB | High quality, *recommended*. |
 | [gemma-2-9b-it-Q4_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q4_K_L.gguf) | Q4_K_L | 6.84GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Good quality, uses about 4.83 bits per weight, *recommended*. |
 | [gemma-2-9b-it-Q4_K_M.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q4_K_M.gguf) | Q4_K_M | 5.76GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
 | [gemma-2-9b-it-Q4_K_S.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q4_K_S.gguf) | Q4_K_S | 5.47GB | Slightly lower quality with more space savings, *recommended*. |
 | [gemma-2-9b-it-IQ4_XS.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-IQ4_XS.gguf) | IQ4_XS | 5.18GB | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
+| [gemma-2-9b-it-Q3_K_XL.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q3_K_XL.gguf) | Q3_K_XL | 6.21GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Lower quality but usable, good for low RAM availability. |
 | [gemma-2-9b-it-Q3_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q3_K_L.gguf) | Q3_K_L | 5.13GB | Lower quality but usable, good for low RAM availability. |
 | [gemma-2-9b-it-Q3_K_M.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q3_K_M.gguf) | Q3_K_M | 4.76GB | Even lower quality. |
 | [gemma-2-9b-it-IQ3_M.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-IQ3_M.gguf) | IQ3_M | 4.49GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
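The first README hunk extends the prompt-format block with a trailing `<end_of_turn>` / `<start_of_turn>model` pair after the model turn. A minimal sketch that assembles a prompt string exactly as the updated README block spells it out; the helper name and example message are illustrative assumptions, and only the special tokens come from the diff:

```python
# Hedged sketch: build the prompt string shown in the updated README template.
def build_gemma_prompt(user_message: str) -> str:
    return (
        "<bos><start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
        "<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(build_gemma_prompt("Explain GGUF quantization in one sentence."))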
gemma-2-9b-it-IQ2_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:94b6fe40ee612eb50159ad3e6fc97cc04c88ec7af8a813246d8d8c27c4522a00
-size 3434988384
+oid sha256:d15fdae729a79d5be3224dfc91ca1c3e36ca5c1b2b45f58d21a5b6ffd0b4f218
+size 3434669824
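This and the following hunks only swap the `oid`/`size` fields in Git LFS pointer files; the actual model blobs live in LFS storage. A minimal sketch of checking a downloaded blob against its pointer, assuming the pointer file has been saved locally; the helper names and paths are hypothetical:

```python
# Hedged sketch: verify a downloaded GGUF blob against its git-lfs spec v1 pointer.
import hashlib
import os

def parse_lfs_pointer(pointer_path: str) -> dict:
    """Read the 'key value' lines of a git-lfs spec v1 pointer file."""
    fields = {}
    with open(pointer_path, "r", encoding="utf-8") as f:
        for line in f:
            key, _, value = line.strip().partition(" ")
            fields[key] = value
    return fields

def verify_blob(blob_path: str, pointer_path: str) -> bool:
    fields = parse_lfs_pointer(pointer_path)
    expected_sha = fields["oid"].split(":", 1)[-1]
    expected_size = int(fields["size"])
    h = hashlib.sha256()
    with open(blob_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha and os.path.getsize(blob_path) == expected_size

# e.g. verify_blob("gemma-2-9b-it-IQ2_M.gguf", "pointers/gemma-2-9b-it-IQ2_M.gguf")
```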
gemma-2-9b-it-IQ2_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ee80dfba6b4b7aee3528d742526d984496e621640d3c75eb521a36afed2c6010
-size 3211805536
+oid sha256:6a8d6e35d9486aeb911874e0191d236989b219b2aff28624cd79f2fa1a0adada
+size 3211486976
gemma-2-9b-it-IQ2_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4f3ae196ff0f2a34927db3ed93451575394fb47e0a843f5d7915f2a9da02e7fd
-size 3067700064
+oid sha256:519b7c3c3dec688972028bbaa3d1ceeb57219d2401b40606817806b192234b88
+size 3067381504
gemma-2-9b-it-IQ3_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:744e3ead642fcbde768639cf2532f78da5b8ca80ce1718fed712c0ab38a5ebda
-size 4494995808
+oid sha256:52887543e6b86c6c3b3e0809f93903d2c7c480ed75870b12fee2fd0f47c95747
+size 4494616320
gemma-2-9b-it-IQ3_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:294638961d516cbc520c0dcd6a143fe0226f9969ee483e3e0d89b0f6f3587e4d
-size 4145369440
+oid sha256:ae768bece7b9a5fe4c6050ed56c5d18259ebbac3e469d72d339d7d6eccd570f5
+size 4144989952
gemma-2-9b-it-IQ3_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:79dfb350e10a4c7598b957c00381b7a8459f64e99457d4d05203dc2528683824
-size 3797058400
+oid sha256:01757277da17951371c892938f2f6a9962d95f478dff735a586d4e9fdb3f98f4
+size 3796739840
gemma-2-9b-it-IQ4_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a8817cc2720f51f1130d994fcacd58b8860c88882d10bbffb29ed4b5f62f155b
-size 5183410528
+oid sha256:d245892642033bd773aa58de1c56e42665ece1f7f7c0ec44dcfe3a96b7d9651e
+size 5183031040
gemma-2-9b-it-Q2_K.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6c3d884107e4d01bac8316cbe755137beac8988cc9c8a9c8806b61f0c4d8bcc5
-size 3805778272
+oid sha256:18a16f7a5aeec0b980b4de59b5e1360230ae1c8adfd134d1767c9e7e11d98e6e
+size 3805398784
gemma-2-9b-it-Q2_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:808f8acb636b47d4a4f3e98c1fff54a71eb6fce8a089f81c3fe2fe393fb78617
+size 4887766784
gemma-2-9b-it-Q3_K_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d97e40aef60b072b15d9bdd41b171f141f46b45a18eaee257d9bf91daea52e1f
-size 5132833120
+oid sha256:fc77fafad18312c3f3d0d316e2edefd28dcd539cba10cc1fbe7f0dc3d53dae6d
+size 5132453632
gemma-2-9b-it-Q3_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:066d5cd0c5ad7a8d2b7407fb53f56858e6736c8248a06111b5f65afcbc709c8f
-size 4762161504
+oid sha256:9ff6aebb809a52bf5560d23c4fda7e96ee9a3a75cf7ab0e20ab3089017020645
+size 4761782016
gemma-2-9b-it-Q3_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6a0dc0e74df5cd76fb134249f4858bd4610049111482daa6dc1d8479ff283f21
-size 4338045280
+oid sha256:1866bcc45b83bebacef1cf9daf09bc94036a2705afbf8eaf9369f9bc6006209e
+size 4337665792
gemma-2-9b-it-Q3_K_XL.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e3c84c98a66aae4b79d92903f1cab526a158315c26a2851cdb6a6174720300a7
+size 6214821632
gemma-2-9b-it-Q4_K_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:05dfbfc607fce0d039cac53098c387eb66c9c17587eb8bbf8b400c4769ae0252
-size 6844347232
+oid sha256:2fee521f3b0f96aa358c0da9d5a8ac18d1ee81426059599ab7b15e9f06c3dc49
+size 6843426560
gemma-2-9b-it-Q4_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0874bf61be2e4b3d0a4a75e58fbd442dc410745d513c1e1e5de0b54ae33e65db
-size 5761438048
+oid sha256:5375972196fae34c1a767bbeba93938d86abb39f2f91ea5453efa36ead6569f1
+size 5761058560
gemma-2-9b-it-Q4_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2ae8828f11ad498fa6fb0c31ccbd5f6ec0bc3b579613f164b8bebc663f9d1763
-size 5479305568
+oid sha256:225432bb66c374da3a94ed5d5c1ff0e6b80d742a8a09281a58e2fff8e9efa72e
+size 5478926080
gemma-2-9b-it-Q5_K_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:381bc6455ee6bd33dfa3987d001678e390019e214a6f3e7fb17e22853a0102da
-size 7730656096
+oid sha256:807d53fe895f460ccdb7817557965852a73886df2026c101457de9c7e3a038ad
+size 7729735424
gemma-2-9b-it-Q5_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fd2ef3823778e3c138aebb5892fcc4587313549ec3c21b84103090ce4b7617b4
-size 6647746912
+oid sha256:78f480cb36e05fedbae67e097840cd71999dde890d57287f4205a331a0d5cefe
+size 6647367424
gemma-2-9b-it-Q5_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4ba2748d33407c999dc1c2fe6845e39eb686792cb5e42d5d1bc20599cfb2bdb7
-size 6483972448
+oid sha256:975f87d0482f74adedeadddeabb68d87b6202da0e7e237687215c6c69b43b91b
+size 6483592960
gemma-2-9b-it-Q6_K.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3f38ef3fe66173a34e05b7f356fa0880b3941e64c94618623f4c243e1a4fde12
-size 7589450080
+oid sha256:edc2b9f3f811cb78101d618a2db360ca374584fbdb8540afae869a6fffaa6516
+size 7589070592
gemma-2-9b-it-Q6_K_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d69cac05ff85f04207610e19def2e741d0fda43c19e0bf64cc7dfef9f005a2ef
-size 8672359264
+oid sha256:af575649b23922300468189a26e381951b298bdb18046f7aeb4fcc63bc30a5d6
+size 8671438592
gemma-2-9b-it-Q8_0.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:68ef14eb96fb6ab0dfb04dc9d73e4b5ff42fda0011110a160881d20b3d84b594
-size 9827640160
+oid sha256:9f9a9de67bec3d6e8277c1964c278aa419c9ed7533cefe6595a8ee4e9c568d01
+size 9827149568
gemma-2-9b-it-Q8_0_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0c2c8fe89751b1be794ce20ae92a764b3fe7fdfad735e5402650b5e40ad5788c
-size 10688230240
+oid sha256:61707731042388bfcf35eadbddfc0a812df783d41bf541de231ba5cda4775347
+size 10687309568
gemma-2-9b-it-f32.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4f7d9c68f33c338a0d98faf6bad67cb13d39a5d4f3c87965bff4d62620d27d70
-size 36974719488
+oid sha256:fbf05a8f90685a86b8b92c8de0b5777f3346a976f496123f36da8585c3177362
+size 36972881408
gemma-2-9b-it.imatrix CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3ebc87750f8830146ae668032ca7e319bfbf89e4276c1a514aeee1be9e6addfd
+oid sha256:8a2ec42f9516ace90f9ecb98781eef3db3b63040319ed9192ea3cf8782ebc454
 size 6116901