---
license: apache-2.0
inference: false
tags:
- auto-gptq
pipeline_tag: text-generation
---

# redpajama gptq: RedPajama-INCITE-Chat-3B-v1

<a href="https://colab.research.google.com/gist/pszemraj/86d2e8485df182302646ed2c5a637059/inference-with-redpajama-incite-chat-3b-v1-gptq-4bit-128g.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

A GPTQ quantization of the [RedPajama-INCITE-Chat-3B-v1](https://huggingface.co/togethercomputer/RedPajama-INCITE-Chat-3B-v1) model via auto-gptq. The quantized model file is only 2 GB.

## Usage

> Note that you cannot yet load directly from the hub with `auto_gptq` - if needed, you can use [this function](https://gist.github.com/pszemraj/8368cba3400bda6879e521a55d2346d0) to download the files using the repo name.

First, install auto-GPTQ:

```bash
pip install ninja auto-gptq[triton]
```