Lewdiculous committed 16e271b
Parent(s): 183ea5f
Update README.md

README.md CHANGED
```diff
@@ -16,6 +16,10 @@ GGUF-IQ-Imatrix quants for [jeiku/Chaos_RP_l3_8B](https://huggingface.co/jeiku/Chaos_RP_l3_8B)
 > **Updated!**
 > These quants have been redone with the fixes from [llama.cpp/pull/6920](https://github.com/ggerganov/llama.cpp/pull/6920) in mind.
 
+> [!NOTE]
+> **Quant:** <br>
+> For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes.
+
 > [!WARNING]
 > Recommended presets [here](https://huggingface.co/Lewdiculous/Model-Requests/tree/main/data/presets/cope-llama-3-0.1) or [here](https://huggingface.co/Virt-io/SillyTavern-Presets). <br>
 > Use the latest version of KoboldCpp. **Use the provided presets.** <br>
```