justinthelaw
committed on
Update README.md
README.md
CHANGED
@@ -39,9 +39,11 @@ This repo contains GPTQ 4-bit, 32g Group Size, quantized model files from the No

 Models are released as sharded safetensors files.

-| Bits | GS | GPTQ Dataset | Seq Len | Size |
-| ---- | -- | ----------- | ------- | ---- |
-| 4 | 32 | [VMWare Open Instruct](https://huggingface.co/datasets/vmware/open-instruct) |
+| Bits | GS | GPTQ Dataset | Max Seq Len | Size | VRAM |
+| ---- | -- | ----------- | ------- | ---- | ---- |
+| 4 | 32 | [VMWare Open Instruct](https://huggingface.co/datasets/vmware/open-instruct) | 32,768 | 4.57 GB | 19-23 Gb*
+
+* Depends on maximum sequence length parameter (KV cache utilization) used with vLLM or Transformers

 <!-- README_GPTQ.md-provided-files end -->

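The footnote added in this change ties the VRAM range to the maximum sequence length through the KV cache. A rough sketch of that relationship follows. The layer count, KV head count, head dimension, and fp16 cache precision below are illustrative assumptions, not values taken from this repo; read the real ones from the model's `config.json`. The function `kv_cache_bytes` is a hypothetical helper, and the estimate covers only the KV cache, not weights, activations, or any memory a serving framework reserves up front.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate KV cache size in bytes for one forward context.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes an fp16/bf16 cache (an assumption).
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical architecture values, chosen only for illustration.
ASSUMED_LAYERS = 32
ASSUMED_KV_HEADS = 8
ASSUMED_HEAD_DIM = 128

if __name__ == "__main__":
    for seq_len in (4096, 16384, 32768):
        size = kv_cache_bytes(ASSUMED_LAYERS, ASSUMED_KV_HEADS,
                              ASSUMED_HEAD_DIM, seq_len)
        print(f"max seq len {seq_len:>6}: {size / 1024**3:.2f} GiB KV cache")
```

Under these assumed values the cache grows linearly with sequence length and reaches 4 GiB at 32,768 tokens, which is why lowering the maximum sequence length reduces the VRAM needed.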