Ichsan2895/Merak-7B-v3-GGUF

Ichsan2895 committed on Oct 7, 2023

Commit 78dde89 • 1 Parent(s): 728e108

Update README.md

Files changed (1)

README.md +15 -0

@@ -44,6 +44,21 @@ These quantised GGUFv2 files are compatible with llama.cpp from August 27th onwa
 They are also compatible with many third party UIs and libraries - please see the list at the top of this README.
 ### Explanation of quantisation methods
 <details>
   <summary>Click to see details</summary>

 They are also compatible with many third party UIs and libraries - please see the list at the top of this README.
+### Provided files
+| Name | Quant method | Bits | Size | Use case |
+| ---- | ---- | ---- | ---- | ---- |
+| [Merak-7B-v3-model-Q2_K.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v3-GGUF/blob/main/Merak-7B-v3-model-q2_k.gguf) | Q2_K | 2 | 3.08 GB| smallest, significant quality loss - not recommended for most purposes |
+| [Merak-7B-v3-model-Q3_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v3-GGUF/blob/main/Merak-7B-v3-model-q3_k_m.gguf) | Q3_K_M | 3 | 3.52 GB| very small, high quality loss |
+| [Merak-7B-v3-model-Q4_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v3-GGUF/blob/main/Merak-7B-v3-model-q4_0.gguf) | Q4_0 | 4 | 4.11 GB| legacy; small, very high quality loss - prefer using Q3_K_M |
+| [Merak-7B-v3-model-Q4_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v3-GGUF/blob/main/Merak-7B-v3-model-q4_k_m.gguf) | Q4_K_M | 4 | 4.37 GB| medium, balanced quality - recommended |
+| [Merak-7B-v3-model-Q5_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v3-GGUF/blob/main/Merak-7B-v3-model-q5_0.gguf) | Q5_0 | 5 | 5.00 GB| legacy; medium, balanced quality - prefer using Q4_K_M |
+| [Merak-7B-v3-model-Q5_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v3-GGUF/blob/main/Merak-7B-v3-model-q5_k_m.gguf) | Q5_K_M | 5 | 5.13 GB| large, very low quality loss - recommended |
+| [Merak-7B-v3-model-Q6_K.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v3-GGUF/blob/main/Merak-7B-v3-model-q6_k.gguf) | Q6_K | 6 | 5.94 GB| very large, extremely low quality loss |
+| [Merak-7B-v3-model-Q8_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v3-GGUF/blob/main/Merak-7B-v3-model-q8_0.gguf) | Q8_0 | 8 | 7.70 GB| very large, extremely low quality loss - not recommended |
+**Note**: the sizes above are file sizes on disk, not RAM figures; with no GPU offloading, expect RAM usage of roughly the file size or more. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
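
As a rough illustration of how the table might be used, the sketch below picks the largest quantisation whose file fits a given memory budget. It is not part of the commit. The sizes are copied from the table; the helper name `pick_quant` and the 1 GB `overhead_gb` margin for context and runtime buffers are assumptions made for illustration, not figures from this README.

```python
# File sizes in GB, copied from the "Provided files" table.
QUANT_SIZES_GB = {
    "Q2_K": 3.08,
    "Q3_K_M": 3.52,
    "Q4_0": 4.11,
    "Q4_K_M": 4.37,
    "Q5_0": 5.00,
    "Q5_K_M": 5.13,
    "Q6_K": 5.94,
    "Q8_0": 7.70,
}


def pick_quant(budget_gb, overhead_gb=1.0):
    """Return the largest quant whose file size plus overhead fits the budget.

    overhead_gb is an assumed safety margin (hypothetical, not from the README).
    Returns None when no file fits.
    """
    fitting = {
        name: size
        for name, size in QUANT_SIZES_GB.items()
        if size + overhead_gb <= budget_gb
    }
    if not fitting:
        return None
    # Largest file that still fits, on the assumption that larger means less quality loss.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for budget in (4.0, 5.5, 8.0, 16.0):
        print(f"{budget:>5.1f} GB budget -> {pick_quant(budget)}")
```

The helper ranks purely by size, so it can return a legacy format such as Q5_0; the "Use case" column suggests preferring the K-quant alternatives where they fit.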
 ### Explanation of quantisation methods
 <details>
   <summary>Click to see details</summary>