---
license: apache-2.0
tags:
- text2text-generation
pipeline_tag: text2text-generation
language:
- zh
- en
---
|
# GPTQ-for-Bloom |
|
4-bit quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323).
|
|
|
GPTQ is a state-of-the-art one-shot weight quantization method.
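To illustrate what the "4bit-128g" naming in the checkpoints below refers to, here is a minimal NumPy sketch of group-wise low-bit quantization: weights are split into groups of 128, and each group stores its own scale and zero-point. This is only the round-to-nearest storage format; GPTQ itself additionally compensates rounding error using second-order information, which this sketch omits. All function names here are illustrative, not taken from the BELLE repository.

```python
import numpy as np

def quantize_groups(weights, bits=4, group_size=128):
    # Split the weights into groups of `group_size`; each group gets
    # its own scale and zero-point (the "128g" in the checkpoint names
    # refers to this group size).
    w = weights.reshape(-1, group_size)
    qmax = 2 ** bits - 1
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    scale = np.maximum(wmax - wmin, 1e-8) / qmax
    zero = np.round(-wmin / scale)
    # Round to the nearest integer level and clip into [0, qmax].
    q = np.clip(np.round(w / scale) + zero, 0, qmax).astype(np.uint8)
    return q, scale, zero

def dequantize_groups(q, scale, zero):
    # Recover approximate float weights from the integer codes.
    return (q.astype(np.float32) - zero) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 256)).astype(np.float32)
q, scale, zero = quantize_groups(w, bits=4, group_size=128)
w_hat = dequantize_groups(q, scale, zero).reshape(w.shape)
err = np.abs(w - w_hat).max()
print(f"max abs reconstruction error: {err:.4f}")
```

At 4 bits each value is one of 16 levels per group, so the reconstruction error is bounded by roughly the per-group scale; this is the trade-off behind the smaller file sizes in the table below.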
|
|
|
The inference code can be found in our GitHub repository: https://github.com/LianjiaTech/BELLE/gptq.
|
|
|
**This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).**
|
|
|
## Model list |
|
|
|
| model name              | file size | GPU memory |
| ----------------------- | --------- | ---------- |
| bloom7b-2m-8bit-128g.pt | 9.7 GB    | 11 GB      |
| bloom7b-2m-4bit-128g.pt | 6.9 GB    | 8 GB       |
| bloom7b-2m-3bit-128g.pt | 6.2 GB    | 7.7 GB     |
|
|
|