mabaochang committed on
Commit 2707373
1 Parent(s): 5681744

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -8,12 +8,14 @@ language:
  - en
  ---
  # GPTQ-for-Bloom
- 4 bits quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323)
+ 8 bits quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323)

  GPTQ is SOTA one-shot weight quantization method.

  The code of inference can be found in our Github project repository: https://github.com/LianjiaTech/BELLE/gptq.

+ Basically, 8-bit quantization and 128 groupsize are recommended.
+
  **This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa)**

  ## Model list