## Provided files

**gptq_model-4bit--1g.safetensors**

This will work with AutoGPTQ as of commit `3cb1bf5` (`3cb1bf5a6d43a06dc34c6442287965d1838303d3`).

It was created without groupsize to reduce VRAM requirements, and with `desc_act` (act-order) to improve inference quality.

* `gptq_model-4bit--1g.safetensors`
  * Works only with the latest AutoGPTQ CUDA build, compiled from source as of commit `3cb1bf5`
  * At this time it does not work with AutoGPTQ Triton, but support will hopefully be added in time.
  * Works with text-generation-webui using `--autogptq --trust_remote_code`
    * At this time it does NOT work with one-click-installers
  * Does not work with any version of GPTQ-for-LLaMa
  * Parameters: Groupsize = None. Act order (desc_act)
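Since the file only works with AutoGPTQ compiled from source at that commit, the setup can be sketched roughly as follows. This is a sketch, not a tested recipe: the repository URL is assumed to be the upstream AutoGPTQ repo, and `<path-to-model>` is a hypothetical placeholder for wherever you downloaded the model.

```shell
# Build AutoGPTQ's CUDA extension from source at the commit this file was tested with.
# Assumes a CUDA toolkit matching your PyTorch build is installed.
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
git checkout 3cb1bf5a6d43a06dc34c6442287965d1838303d3
pip install .

# Then launch text-generation-webui with the flags noted above
# (<path-to-model> is a placeholder, not a real path from this repo):
python server.py --autogptq --trust_remote_code --model <path-to-model>
```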
<!-- footer start -->

## Discord