Update README.md
README.md CHANGED
@@ -6,9 +6,11 @@ tags:
---

# vicuna-13b-4bit
Converted `vicuna-13b` to GPTQ 4bit using `true-sequential` and `groupsize 128` in `safetensors` for the best possible model performance.

-https://
+Vicuna is a high-coherence model based on Llama that is comparable to ChatGPT. Read more here: https://vicuna.lmsys.org/
+
+GPTQ - https://github.com/qwopqwop200/GPTQ-for-LLaMa

# Update 2023-04-03
Recent GPTQ commits have introduced breaking changes to model loading; you should use commit `a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773` on the `cuda` branch.
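The update note above pins GPTQ-for-LLaMa to a specific known-good commit. As a minimal sketch (assuming the repo is already cloned into `./GPTQ-for-LLaMa`; plain `git` in a shell works just as well), the pin-and-branch step could be scripted like this:

```python
# Hedged sketch: pin GPTQ-for-LLaMa to the known-good commit on a local
# `cuda-stable` branch. Assumes the repo is already cloned at ./GPTQ-for-LLaMa.
import subprocess

GOOD_COMMIT = "a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773"

# `git checkout <commit> -b cuda-stable` creates the branch at that commit
# and switches to it, matching the branch name described in this README.
subprocess.run(
    ["git", "checkout", GOOD_COMMIT, "-b", "cuda-stable"],
    cwd="GPTQ-for-LLaMa",
    check=True,
)
```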
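Separately, since the conversion at the top of the diff ships the 4-bit weights as a `safetensors` checkpoint, here is a minimal sketch of sanity-checking such a file. The filename is hypothetical, and `qweight`/`qzeros`/`scales` is only the typical GPTQ-for-LLaMa tensor layout, not something this card states:

```python
# Hedged sketch: peek inside the quantized checkpoint. Only the safetensors
# API is assumed; the filename is hypothetical.
from safetensors import safe_open

CKPT = "vicuna-13b-4bit-128g.safetensors"  # hypothetical filename

with safe_open(CKPT, framework="pt") as f:
    for name in list(f.keys())[:8]:  # print the first few entries
        tensor = f.get_tensor(name)
        print(name, tuple(tensor.shape), tensor.dtype)

# GPTQ-for-LLaMa checkpoints typically store packed `qweight`, `qzeros`,
# and per-group `scales` tensors for each linear layer (with `groupsize 128`,
# one scale/zero-point group per 128 input columns).
```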
@@ -27,6 +29,7 @@ This creates and switches to a `cuda-stable` branch to continue using the quantized models.
Since this model is instruction-tuned, use the following format for inference to get the best results (note that the instruction format is different from Alpaca's):
```
### Human: your-prompt
+### Assistant:
```

If you want deterministic results, turn off sampling. You can turn it off in the webui by unchecking `do_sample`.
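To make the prompt format and the `do_sample` note concrete, here is a minimal sketch using standard `transformers` calls. It assumes a `model` and `tokenizer` have already been loaded through GPTQ-for-LLaMa's 4-bit loading path, which this diff does not show:

```python
# Hedged sketch: Vicuna-style prompt plus deterministic (greedy) decoding.
# `model` and `tokenizer` are assumed to be loaded elsewhere via the
# GPTQ-for-LLaMa 4-bit loading path; only standard transformers calls appear.

def build_prompt(user_message: str) -> str:
    # The instruction format from the card: a Human turn, then an open
    # Assistant turn for the model to complete.
    return f"### Human: {user_message}\n### Assistant:"

inputs = tokenizer(build_prompt("What is GPTQ?"), return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,  # greedy decoding, i.e. the webui's `do_sample` unchecked
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```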