
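There are currently three mainstream quantization formats for large models:

  • GGUF: mainly intended for CPU inference.
  • GPTQ: intended for GPU inference.
  • AWQ: can be understood as a newer quantization method that delivers better performance than GPTQ quantization.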
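As a rough illustration of what weight quantization does in general, the sketch below applies plain round-to-nearest 4-bit quantization to one small group of weights and packs the resulting codes into a single 32-bit integer. This is not the AWQ, GPTQ, or GGUF algorithm; it is only the simple baseline those methods improve on. The function names, the group of eight weights, and the packing order are all assumptions made for illustration and do not reflect the storage layout of any specific library.

```python
def quantize_group(weights, bits=4):
    """Asymmetric round-to-nearest quantization of one group of floats.

    Returns integer codes in [0, 2**bits - 1] plus the scale and the
    minimum value (zero offset) needed to map the codes back to floats.
    """
    qmax = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    codes = [min(qmax, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]


def pack_int4(codes):
    """Pack eight 4-bit codes into one 32-bit integer (lowest nibble first).

    The nibble order is an illustrative choice, not a real library layout.
    """
    assert len(codes) == 8
    word = 0
    for i, c in enumerate(codes):
        word |= (c & 0xF) << (4 * i)
    return word


def unpack_int4(word):
    """Inverse of pack_int4: recover the eight 4-bit codes."""
    return [(word >> (4 * i)) & 0xF for i in range(8)]


# Hypothetical group of eight weights, chosen only for the demonstration.
weights = [-0.8, -0.35, -0.1, 0.0, 0.05, 0.2, 0.45, 0.7]

codes, scale, zero = quantize_group(weights)
packed = pack_int4(codes)
restored = dequantize_group(unpack_int4(packed), scale, zero)

# Worst-case reconstruction error; bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, hex(packed), max_err)
```

Each weight is stored in 4 bits instead of 16, at the cost of a reconstruction error of at most half a quantization step per weight. Methods such as GPTQ and AWQ aim to reduce the impact of that error on model quality rather than rounding every weight independently.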

notebook

AWQ quantization

Some parts of it still need to be understood.

Original notebook

Safetensors
Model size: 1.13B params
Tensor type: I32 · FP16