	---
	license: apache-2.0
	language:
	- zh
	- en
	tags:
	- moe
	---

	# Chinese-Mixtral-GGUF
	<p align="center">
	<a href="https://github.com/ymcui/Chinese-Mixtral"><img src="https://ymcui.com/images/chinese-mixtral-banner.png" width="600"/></a>
	</p>

	Chinese Mixtral GitHub repository: https://github.com/ymcui/Chinese-Mixtral

	This repository contains GGUF-v3 models (llama.cpp compatible) for Chinese-Mixtral. Note that Chinese-Mixtral is not a chat/instruction model.

	## Performance

	Metric: perplexity (PPL); lower is better.

	| Quant | PPL |
	| ----- | ---- |
	| IQ2_XXS | 8.5981 +/- 0.09267 |
	| IQ2_XS | 6.9784 +/- 0.07476 |
	| Q2_K | 5.1846 +/- 0.05533 |
	| IQ3_XXS | 4.5990 +/- 0.04969 |
	| Q3_K | 4.5545 +/- 0.04893 |
	| Q4_0 | 4.4917 +/- 0.04844 |
	| Q4_K | 4.4488 +/- 0.04813 |
	| Q5_0 | 4.4224 +/- 0.04753 |
	| Q5_K | 4.4192 +/- 0.04768 |
	| Q6_K | 4.4092 +/- 0.04758 |
	| Q8_0 | 4.4076 +/- 0.04746 |
	| F16 | x |
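
To make the trade-off in the table easier to read, the sketch below computes each quantization's PPL increase relative to Q8_0, the lowest-PPL entry that was measured. It reuses only the mean values from the table (error bars are dropped). The `relative_ppl` helper and the choice of Q8_0 as the baseline are illustrative assumptions, not part of the original evaluation.

```shell
#!/bin/sh
# Mean PPL values copied from the table above (the +/- error bars are omitted).
ppl_table='IQ2_XXS 8.5981
IQ2_XS 6.9784
Q2_K 5.1846
IQ3_XXS 4.5990
Q3_K 4.5545
Q4_0 4.4917
Q4_K 4.4488
Q5_0 4.4224
Q5_K 4.4192
Q6_K 4.4092
Q8_0 4.4076'

# Hypothetical helper: print each quant's PPL increase over Q8_0, in percent.
# Q8_0 (4.4076) is used as the baseline because F16 has no PPL listed.
relative_ppl() {
  printf '%s\n' "$ppl_table" |
    awk -v base=4.4076 '{ printf "%-8s %+.2f%%\n", $1, ($2 / base - 1) * 100 }'
}

relative_ppl
```

With these numbers, Q2_K lands roughly 17.6% above Q8_0, while Q4_K and larger stay within about 1%.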

	Because of the file size limit, the F16 model is split into multiple parts. Use the `cat` command to concatenate all parts, in order, into a single file.
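
A minimal sketch of the concatenation step follows. The `join_parts` helper and the file names in the usage comment are hypothetical; check the repository's file list for the actual part names. The sketch assumes the part suffixes sort in the intended order (for example `-aa`, `-ab` or `-00`, `-01`), since the shell expands the glob in sorted order.

```shell
#!/bin/sh
# join_parts PREFIX OUTPUT
# Concatenate every file whose name starts with PREFIX into OUTPUT.
# The glob is expanded in sorted order, so the part suffixes must sort
# in the order the parts are meant to be joined.
join_parts() {
  prefix=$1
  output=$2

  # Refuse to overwrite an existing file.
  if [ -e "$output" ]; then
    echo "refusing to overwrite $output" >&2
    return 1
  fi

  # Replace the function's arguments with the sorted list of parts.
  set -- "$prefix"*
  if [ ! -e "$1" ]; then
    echo "no parts found for prefix $prefix" >&2
    return 1
  fi

  cat "$@" > "$output"
}

# Usage (hypothetical file names):
#   join_parts chinese-mixtral-f16.gguf.part- chinese-mixtral-f16.gguf
```

After joining, comparing the size of the output with the sum of the part sizes is a quick sanity check before loading the model.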


	## Others

	- For the Hugging Face version, see: https://huggingface.co/hfl/chinese-mixtral

	- If you have questions/issues regarding this model, please submit an issue through https://github.com/ymcui/Chinese-Mixtral/.