Update README.md
README.md
@@ -3,7 +3,7 @@ license: other
 inference: false
 ---
 
-# Alpaca LoRA GPTQ 4bit
+# Alpaca LoRA 65B GPTQ 4bit
 
 This is a [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) 4bit quantisation of [changsung's alpaca-lora-65B](https://huggingface.co/chansung/alpaca-lora-65b).
 
@@ -15,6 +15,8 @@ I can't guarantee that the two 128g files will work in only 40GB of VRAM.
 
 I haven't specifically tested VRAM requirements yet but will aim to do so at some point. If you have any experiences to share, please do so in the comments.
 
+If you want to try CPU inference, you can use my GGML repo instead: [TheBloke/alpaca-lora-65B-GGML](https://huggingface.co/TheBloke/alpaca-lora-65B-GGML).
+
 ## GIBBERISH OUTPUT IN `text-generation-webui`?
 
 Please read the Provided Files section below. You should use `alpaca-lora-65B-GPTQ-4bit-128g.no-act-order.safetensors` unless you are able to use the latest Triton branch of GPTQ-for-LLaMa.
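As a minimal sketch of how the recommended file could be fetched, the snippet below uses `huggingface_hub`. The `repo_id` is an assumption inferred from the model name in this README; the filename is the `no-act-order` file recommended above.

```python
# Minimal sketch: download the recommended no-act-order GPTQ file.
# NOTE: repo_id is an assumption based on the model name in this README;
# the filename is the one recommended in the text above.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/alpaca-lora-65B-GPTQ-4bit",  # assumed repository id
    filename="alpaca-lora-65B-GPTQ-4bit-128g.no-act-order.safetensors",
)
print(model_path)  # local cache path of the downloaded .safetensors file
```

The resulting path can then be pointed at by `text-generation-webui` or whichever GPTQ-for-LLaMa loader you use.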