mabaochang committed on
Commit
7e11068
1 Parent(s): 4171511

Update README.md

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -1,3 +1,20 @@
  ---
  license: apache-2.0
  ---
+ # GPTQ-for-Bloom
+ 4-bit quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323).
+
+ GPTQ is a state-of-the-art one-shot weight quantization method.
+
+ The inference code can be found in our GitHub project repository: https://github.com/LianjiaTech/BELLE/gptq.
+
+ **This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).**
+
+ ## Model list
+
+ | model name              | file size | GPU memory |
+ | ----------------------- | --------- | ---------- |
+ | bloom7b-2m-8bit-128g.pt | 9.7G      | 11G        |
+ | bloom7b-2m-4bit-128g.pt | 6.9G      | 8G         |
+ | bloom7b-2m-3bit-128g.pt | 6.2G      | 7.7G       |
+
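
Reviewer note: the file names in the model list appear to encode a bit width and a group size (for example `4bit-128g` read as 4 bits with groups of 128 weights). That reading is an assumption based on the naming used in GPTQ-for-LLaMa; the README itself does not spell it out. As a rough illustration of what group-wise low-bit storage means, here is a minimal sketch of plain round-to-nearest quantization per group. It is not GPTQ (which additionally corrects rounding error using second-order information) and not the BELLE code; the function names `quantize_group`, `dequantize_group`, and `quantize_row` are hypothetical.

```python
from typing import List, Tuple

# One quantized group: integer codes plus the scale and offset needed to decode them.
Group = Tuple[List[int], float, float]


def quantize_group(values: List[float], bits: int) -> Group:
    """Round-to-nearest asymmetric quantization of one group of weights."""
    levels = (1 << bits) - 1              # highest code, e.g. 15 for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Map each value to the nearest code and clamp to the valid range.
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_group(group: Group) -> List[float]:
    """Recover approximate weights from codes, scale, and offset."""
    codes, scale, lo = group
    return [c * scale + lo for c in codes]


def quantize_row(row: List[float], bits: int, group_size: int) -> List[Group]:
    """Split a weight row into consecutive groups and quantize each one separately."""
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]


if __name__ == "__main__":
    row = [i / 10 for i in range(-8, 8)]  # 16 toy weights
    groups = quantize_row(row, bits=4, group_size=8)
    restored = [w for g in groups for w in dequantize_group(g)]
    print(max(abs(a - b) for a, b in zip(row, restored)))
```

Each group carries its own scale and offset, so a smaller group size tracks local weight ranges more closely at the cost of storing more metadata per weight.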