BelleGroup/BELLE-7B-gptq

mabaochang committed on Mar 25, 2023

Commit 2707373 • 1 Parent(s): 5681744

Update README.md

Files changed (1)

README.md +3 -1

@@ -8,12 +8,14 @@ language:
 - en
 ---
 # GPTQ-for-Bloom
-4 bits quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323)
+8 bits quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323)
 GPTQ is SOTA one-shot weight quantization method.
 The code of inference can be found in our Github project repository: https://github.com/LianjiaTech/BELLE/gptq.
+Basically, 8-bit quantization and 128 groupsize are recommended.
 **This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa)**
 ## Model list
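
The line added in this commit recommends 8-bit quantization with a group size of 128. As a rough illustration of what those two settings mean, the sketch below applies plain round-to-nearest quantization to a flat list of weights, storing one scale and offset per group of 128 values. This is not the GPTQ algorithm itself (GPTQ additionally uses approximate second-order information to compensate for rounding error), and the function names `quantize_groupwise`, `dequantize_groupwise`, and `max_abs_error` are hypothetical helpers, not part of the BELLE or GPTQ-for-LLaMa code.

```python
import math


def quantize_groupwise(weights, bits=8, groupsize=128):
    """Round-to-nearest quantization with one (scale, offset) pair per group."""
    qmax = 2 ** bits - 1  # largest integer code, e.g. 255 for 8 bits
    groups = []
    for start in range(0, len(weights), groupsize):
        group = weights[start:start + groupsize]
        lo, hi = min(group), max(group)
        # A constant group has zero range; any non-zero scale works in that case.
        scale = (hi - lo) / qmax if hi > lo else 1.0
        # Map each weight to an integer code in [0, qmax].
        codes = [min(qmax, max(0, round((w - lo) / scale))) for w in group]
        groups.append((codes, scale, lo))
    return groups


def dequantize_groupwise(groups):
    """Reconstruct approximate weights from the integer codes."""
    return [code * scale + lo for codes, scale, lo in groups for code in codes]


def max_abs_error(weights, bits, groupsize=128):
    """Largest absolute difference between original and restored weights."""
    restored = dequantize_groupwise(quantize_groupwise(weights, bits, groupsize))
    return max(abs(a - b) for a, b in zip(weights, restored))


# Synthetic stand-in for a weight row (assumption: real weights come from the model).
weights = [math.sin(0.37 * i) for i in range(256)]
err8 = max_abs_error(weights, bits=8)
err4 = max_abs_error(weights, bits=4)
print(f"max abs error, 8-bit: {err8:.5f}")
print(f"max abs error, 4-bit: {err4:.5f}")
```

On this synthetic input the 8-bit reconstruction error is smaller than the 4-bit one, because each group's range is divided into 255 steps instead of 15. A smaller group size gives each scale fewer values to cover, at the cost of storing more scale and offset pairs.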