Lewdiculous committed on
Commit 16e271b
1 Parent(s): 183ea5f

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -16,6 +16,10 @@ GGUF-IQ-Imatrix quants for [jeiku/Chaos_RP_l3_8B](https://huggingface.co/jeiku/C
  > **Updated!**
  > These quants have been redone with the fixes from [llama.cpp/pull/6920](https://github.com/ggerganov/llama.cpp/pull/6920) in mind.
  
+ > [!NOTE]
+ > **Quant:** <br>
+ > For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes.
+
  > [!WARNING]
  > Recommended presets [here](https://huggingface.co/Lewdiculous/Model-Requests/tree/main/data/presets/cope-llama-3-0.1) or [here](https://huggingface.co/Virt-io/SillyTavern-Presets). <br>
  > Use the latest version of KoboldCpp. **Use the provided presets.** <br>
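The added note recommends the Q4_K_M-imat quant with a 12288-token context on 8GB VRAM GPUs. A minimal launch sketch for KoboldCpp, assuming the downloaded quant is named `Chaos_RP_l3_8B-Q4_K_M-imat.gguf` (a placeholder filename) and that a CUDA build is available; the layer count is likewise an assumption for an 8B model and may need tuning to fit your VRAM:

```shell
# Sketch: launch KoboldCpp with the recommended quant and context size.
# The .gguf filename and --gpulayers value below are assumptions, not
# values from the model card -- substitute the file you actually downloaded.
python koboldcpp.py \
  --model Chaos_RP_l3_8B-Q4_K_M-imat.gguf \
  --contextsize 12288 \
  --usecublas \
  --gpulayers 33
```

If generation slows or VRAM overflows, lowering `--gpulayers` offloads fewer layers to the GPU at the cost of speed.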