bartowski committed
Commit 4b3f114 · verified · 1 Parent(s): 148bf50

Update README.md

Files changed (1):
  1. README.md +3 -3
README.md CHANGED
@@ -235,9 +235,9 @@ Today Date: 26 Jul 2024
 | [Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf) | Q4_K_M | 4.92GB | false | Good quality, default size for most use cases, *recommended*. |
 | [Meta-Llama-3.1-8B-Instruct-Q3_K_XL.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q3_K_XL.gguf) | Q3_K_XL | 4.78GB | false | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
 | [Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf) | Q4_K_S | 4.69GB | false | Slightly lower quality with more space savings, *recommended*. |
-| [Meta-Llama-3.1-8B-Instruct-Q4_0_8_8.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_0_8_8.gguf) | Q4_0_8_8 | 4.67GB | false | Optimized for ARM and AVX inference. Requires 'sve' support for ARM (see details below). Don't use on Mac. |
-| [Meta-Llama-3.1-8B-Instruct-Q4_0_4_8.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_0_4_8.gguf) | Q4_0_4_8 | 4.67GB | false | Optimized for ARM inference. Requires 'i8mm' support (see details below). Don't use on Mac. |
-| [Meta-Llama-3.1-8B-Instruct-Q4_0_4_4.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_0_4_4.gguf) | Q4_0_4_4 | 4.67GB | false | Optimized for ARM inference. Should work well on all ARM chips, not for use with GPUs. Don't use on Mac. |
+| [Meta-Llama-3.1-8B-Instruct-Q4_0_8_8.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_0_8_8.gguf) | Q4_0_8_8 | 4.66GB | false | Optimized for ARM and AVX inference. Requires 'sve' support for ARM (see details below). Don't use on Mac. |
+| [Meta-Llama-3.1-8B-Instruct-Q4_0_4_8.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_0_4_8.gguf) | Q4_0_4_8 | 4.66GB | false | Optimized for ARM inference. Requires 'i8mm' support (see details below). Don't use on Mac. |
+| [Meta-Llama-3.1-8B-Instruct-Q4_0_4_4.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_0_4_4.gguf) | Q4_0_4_4 | 4.66GB | false | Optimized for ARM inference. Should work well on all ARM chips, not for use with GPUs. Don't use on Mac. |
 | [Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf) | IQ4_XS | 4.45GB | false | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
 | [Meta-Llama-3.1-8B-Instruct-Q3_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q3_K_L.gguf) | Q3_K_L | 4.32GB | false | Lower quality but usable, good for low RAM availability. |
 | [Meta-Llama-3.1-8B-Instruct-Q3_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q3_K_M.gguf) | Q3_K_M | 4.02GB | false | Low quality. |
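The three Q4_0_X_Y rows in the diff differ only in which ARM features they assume, so the choice comes down to what the target CPU actually reports. As a rough sketch (not part of this commit): on an aarch64 Linux machine the kernel lists flags such as 'sve' and 'i8mm' on the "Features" line of /proc/cpuinfo, and a hypothetical helper like `pick_arm_quant` below maps those flags onto the table's recommendations.

```python
# Hypothetical helper (not from the repo): choose a Q4_0_X_Y variant by
# reading the aarch64 "Features" line from /proc/cpuinfo (Linux only).
def pick_arm_quant(cpuinfo_path: str = "/proc/cpuinfo") -> str | None:
    flags: set[str] = set()
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.lower().startswith("features"):
                    flags.update(line.split(":", 1)[1].split())
    except OSError:
        # No /proc/cpuinfo (e.g. macOS) -- the table says not to use
        # the Q4_0_X_Y files on Mac anyway.
        return None
    if "sve" in flags:
        return "Q4_0_8_8"   # requires 'sve' per the table
    if "i8mm" in flags:
        return "Q4_0_4_8"   # requires 'i8mm' per the table
    return "Q4_0_4_4"       # baseline: should work on all ARM chips

print(pick_arm_quant())
```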
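Since every row links to a single file in this repo, fetching one programmatically is straightforward. A minimal sketch using huggingface_hub's `hf_hub_download` (the filename must match a table entry exactly; the Q4_K_M pick here just follows the table's *recommended* default):

```python
# Minimal sketch: download one quant file from this repo.
# hf_hub_download caches the file locally and returns its path.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="bartowski/Meta-Llama-3.1-8B-Instruct-GGUF",
    filename="Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
)
print(path)  # local path to the cached GGUF
```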