# vicuna-13b-4bit
Converted `vicuna-13b` to GPTQ 4-bit using `true-sequential` and `groupsize 128`, saved in `safetensors` format for the best possible model performance.
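A conversion like this is typically done with the GPTQ-for-LLaMa quantization script. The exact command used here is not recorded in this repo; the sketch below is a hypothetical reconstruction from the options named above (`true-sequential`, `groupsize 128`, `safetensors`), with a placeholder model path and output filename:

```sh
# Hypothetical reconstruction -- flags mirror the options named above;
# the actual conversion command is not part of this repo.
python llama.py /path/to/vicuna-13b c4 \
    --wbits 4 \
    --true-sequential \
    --groupsize 128 \
    --save_safetensors vicuna-13b-4bit-128g.safetensors
```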
This does **not** support `llama.cpp` or any other C++ implementation; it only works with the `cuda` or `triton` GPTQ backends, since the C++ implementations require a different model format.
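For reference, the `safetensors` container the GPU backends read is easy to inspect: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype, shape, and byte offsets, then the raw tensor bytes. A minimal stdlib-only sketch with a toy two-element tensor (not this model's weights):

```python
import json
import struct

# Build a minimal .safetensors payload in memory: one fp32 tensor of shape (2,).
data = struct.pack("<2f", 1.0, 2.0)  # 8 raw little-endian float bytes
header = {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}}
hjson = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(hjson)) + hjson + data  # u64 LE header-length prefix

# Parse it back the way a loader would.
(hlen,) = struct.unpack_from("<Q", blob, 0)
parsed = json.loads(blob[8 : 8 + hlen])
start, end = parsed["weight"]["data_offsets"]
values = struct.unpack("<2f", blob[8 + hlen + start : 8 + hlen + end])
print(values)  # -> (1.0, 2.0)
```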
Vicuna is a high-coherence model based on Llama that is comparable to ChatGPT. Read more at https://vicuna.lmsys.org/