Lewdiculous/Aura_v2_7B-GGUF-IQ-Imatrix


Lewdiculous committed on Apr 16, 2024

Commit a3e4098 • 1 Parent(s): 3419502

Update README.md

Files changed (1)

README.md +1 -1


@@ -38,7 +38,7 @@ In this repository you can find **GGUF-IQ-Imatrix** quants for [ResplendentAI/Au
 </summary>
 *Assuming a context size of 8192 for simplicity and 1GB of Operating System VRAM overhead with some safety margin to avoid overflowing buffers...* <br> <br>
-**For 11-12GB VRAM:** <br> A GPU with **12GB** of VRAM capacity can comfortably use the **Q6_K-imat** quant option and run it at good speeds. <br> This is the same with or without using #vision capabilities. <br> <br>
+**For 11-12GB VRAM:** <br> A GPU with **11-12GB** of VRAM capacity can comfortably use the **Q6_K-imat** quant option and run it at good speeds. <br> This is the same with or without using #vision capabilities. <br> <br>
 **For 8GB VRAM:** <br> If not using #vision, for GPUs with **8GB** of VRAM capacity the **Q5_K_M-imat** quant option will fit comfortably and should run at good speeds. <br> If **you are** also using #vision from this model opt for the **Q4_K_M-imat** quant option to avoid filling the buffers and potential slowdowns. <br><br>
 **For 6GB VRAM:** <br> If not using #vision, for GPUs with **6GB** of VRAM capacity the **IQ3_M-imat** quant option should fit comfortably to run at good speeds. <br> If **you are** also using #vision from this model opt for the **IQ3_XXS-imat** quant option. <br><br>
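
The VRAM guidance in the updated README text can be read as a small lookup. The sketch below is only an illustration of that reading, not part of the repository: the function name `recommend_quant` is hypothetical, and treating in-between capacities (for example 10GB) as belonging to the next lower tier is an assumption the README does not state. It inherits the README's own assumptions of a context size of 8192 and roughly 1GB of operating system VRAM overhead.

```python
from typing import Optional


def recommend_quant(vram_gb: float, use_vision: bool = False) -> Optional[str]:
    """Return the quant option the README suggests for a given VRAM capacity.

    Hypothetical helper: the tier names come from the README text, but the
    rounding of in-between capacities down to the next tier is an assumption.
    """
    if vram_gb >= 11:
        # 11-12GB: the same quant is suggested with or without #vision.
        return "Q6_K-imat"
    if vram_gb >= 8:
        # 8GB: a smaller quant is suggested when #vision is also in use.
        return "Q4_K_M-imat" if use_vision else "Q5_K_M-imat"
    if vram_gb >= 6:
        # 6GB: a smaller quant is suggested when #vision is also in use.
        return "IQ3_XXS-imat" if use_vision else "IQ3_M-imat"
    # The README gives no recommendation below 6GB.
    return None


if __name__ == "__main__":
    for vram in (12, 8, 6):
        for vision in (False, True):
            print(vram, vision, recommend_quant(vram, vision))
```

Returning `None` below 6GB keeps the helper from implying a recommendation the README does not make.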