Orion-zhen/Qwen2-72B-Instruct-mix-calibration-4b-exl2

Text Generation

text-generation-inference

Inference Endpoints

4-bit precision


Orion-zhen committed on Jun 11

Commit cb8cbe0 (1 parent: 4256392)

Update README.md

Files changed (1)

README.md +6 -0

@@ -11,6 +11,12 @@ tags:
 # Qwen2-72B-Instruct
+## Quantization
+This model is an exl2 quantized model that uses [mixed-exl-calibration](https://huggingface.co/datasets/Orion-zhen/mixed-exl-calibration) as its calibration dataset.
+Compared with the usual wikitext calibration, this may give slightly better performance on English and Chinese text, among others.
 ## Introduction
 Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 72B Qwen2 model.
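
As a rough illustration of what 4-bit quantization means for a model of this size, the sketch below estimates the weight footprint. It assumes the nominal 72 billion parameters mentioned above and that the `4b` in the repository name means an average of 4.0 bits per weight; both values are assumptions made for illustration, and the estimate ignores the KV cache, activations, and any per-tensor overhead. The helper name `approx_weight_bytes` is hypothetical.

```python
def approx_weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in bytes."""
    return n_params * bits_per_weight / 8


# Assumptions: nominal parameter count from the model name, and an
# average of 4.0 bits per weight inferred from the "4b" suffix.
N_PARAMS = 72e9
BITS_PER_WEIGHT = 4.0

size_bytes = approx_weight_bytes(N_PARAMS, BITS_PER_WEIGHT)
size_gb = size_bytes / 1e9        # decimal gigabytes
size_gib = size_bytes / 2**30     # binary gibibytes

# Same estimate at 16 bits per weight, for comparison.
fp16_gb = approx_weight_bytes(N_PARAMS, 16.0) / 1e9

print(f"4.0 bpw: ~{size_gb:.0f} GB ({size_gib:.1f} GiB)")
print(f"16 bpw:  ~{fp16_gb:.0f} GB")
```

Under these assumptions the quantized weights come to roughly a quarter of the 16-bit size; the real files may differ because the actual parameter count and per-layer bit allocation are not stated here.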