barius committed on
Commit
1427fc7
1 Parent(s): 2707373

Update README.md

Files changed (1)
  1. README.md +73 -0
README.md CHANGED
@@ -1,3 +1,13 @@
  ---
  license: apache-2.0
  tags:
@@ -8,6 +18,11 @@ language:
  - en
  ---
  # GPTQ-for-Bloom
  8-bit quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323).

  GPTQ is a state-of-the-art (SOTA) one-shot weight quantization method.
@@ -28,3 +43,61 @@ Basically, 8-bit quantization and 128 groupsize are recommended.
  | bloom7b-0.2m-8bit-128g.pt | 9.7G | ~11.4G |
  | bloom7b-0.2m-4bit-128g.pt | 6.9G | ~8.4G |
+ ---
+ license: apache-2.0
+ tags:
+ - text2text-generation
+ pipeline_tag: text2text-generation
+ language:
+ - zh
+ - en
+ ---
+
  ---
  license: apache-2.0
  tags:
 
  - en
  ---
  # GPTQ-for-Bloom
+
+ ## Welcome
+ If you find this model helpful, please *like* this model and star us on https://github.com/LianjiaTech/BELLE !
+
+ ## Model description
  8-bit quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323).

  GPTQ is a state-of-the-art (SOTA) one-shot weight quantization method.
 
  | bloom7b-0.2m-8bit-128g.pt | 9.7G | ~11.4G |
  | bloom7b-0.2m-4bit-128g.pt | 6.9G | ~8.4G |

+ ## Citation
+
+ Please cite us when using our code, data or model.
+
+ ```
+ @misc{BELLE,
+ author = {Yunjie Ji and Yong Deng and Yan Gong and Yiping Peng and Qiang Niu and Baochang Ma and Xiangang Li},
+ title = {BELLE: Bloom-Enhanced Large Language model Engine},
+ year = {2023},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
+ }
+ ```
+
+ Please also cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers.
+
+ ***
+
+ # GPTQ-for-Bloom
+
+ ## Welcome
+ If you find this model helpful, please like this model and star the https://github.com/LianjiaTech/BELLE project!
+
+ ## Model description
+ 8-bit quantization of the [Bloom](https://arxiv.org/pdf/2211.05100.pdf) model using [GPTQ](https://arxiv.org/abs/2210.17323).
+
+ GPTQ is currently the state-of-the-art (SOTA) one-shot weight quantization method.
+
+ The inference code for this model is available at https://github.com/LianjiaTech/BELLE/gptq .
+
+ In general, 8-bit quantization with groupsize = 128 is recommended.
+
+ **The inference code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa)**
+
+ ## Model list
+
+ | Model name | File size | GPU memory usage |
+ | -------------------------------------------------- | ------------------- | ------------------ |
+ | base | 27G | ~28.2G |
+ | bloom7b-2m-8bit-128g.pt | 9.7G | ~11.4G |
+ | bloom7b-2m-4bit-128g.pt | 6.9G | ~8.4G |
+ | bloom7b-0.2m-8bit-128g.pt | 9.7G | ~11.4G |
+ | bloom7b-0.2m-4bit-128g.pt | 6.9G | ~8.4G |
+
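As a rough illustration of what "8-bit" and "groupsize = 128" mean in the file names above, the sketch below applies plain round-to-nearest quantization with one scale and one offset per group of 128 weights. It is not the GPTQ algorithm and not the BELLE inference code: GPTQ additionally adjusts the not-yet-quantized weights to compensate for rounding error. The function names `quantize_groupwise` and `dequantize_groupwise`, and the random weight row, are made up for this example.

```python
import random


def quantize_groupwise(weights, bits=8, groupsize=128):
    """Round-to-nearest quantization with one (scale, offset) pair per group.

    Illustrative only: this is NOT GPTQ, which also corrects rounding error.
    """
    qmax = (1 << bits) - 1  # largest integer code, e.g. 255 for 8 bits
    codes, params = [], []
    for start in range(0, len(weights), groupsize):
        group = weights[start:start + groupsize]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / qmax or 1.0  # fall back to 1.0 for a constant group
        params.append((scale, lo))
        for w in group:
            q = round((w - lo) / scale)
            codes.append(min(qmax, max(0, q)))  # clamp to the valid code range
    return codes, params


def dequantize_groupwise(codes, params, groupsize=128):
    """Map integer codes back to approximate float weights."""
    restored = []
    for i, q in enumerate(codes):
        scale, lo = params[i // groupsize]
        restored.append(lo + q * scale)
    return restored


# Hypothetical weight row: 512 values, so 4 groups of 128.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(512)]

for bits in (8, 4):
    codes, params = quantize_groupwise(row, bits=bits, groupsize=128)
    restored = dequantize_groupwise(codes, params, groupsize=128)
    max_err = max(abs(a - b) for a, b in zip(row, restored))
    print(f"{bits}-bit: {len(params)} groups, max abs error {max_err:.6f}")
```

Smaller groups follow local weight ranges more closely but store more per-group parameters, while fewer bits shrink the codes and coarsen each quantization step.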
+ ## Citation
+ If you use the code, data, or model from this project, please cite it.
+ ```
+ @misc{BELLE,
+ author = {Yunjie Ji and Yong Deng and Yan Gong and Yiping Peng and Qiang Niu and Baochang Ma and Xiangang Li},
+ title = {BELLE: Bloom-Enhanced Large Language model Engine},
+ year = {2023},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
+ }
+ ```
+ Please also cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers.