	---
	license: apache-2.0
	language:
	- zh
	- en
	tags:
	- moe
	---

	# Chinese-Mixtral
	<p align="center">
	<a href="https://github.com/ymcui/Chinese-Mixtral"><img src="https://ymcui.com/images/chinese-mixtral-banner.png" width="600"/></a>
	</p>

	Chinese Mixtral GitHub repository: https://github.com/ymcui/Chinese-Mixtral

	This repository contains Chinese-Mixtral, which is obtained by further pre-training [Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1).

	Note: this is a foundation model and is not suitable for conversation, QA, or similar interactive tasks.
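
Because this is a foundation model, the natural way to use it is plain text completion rather than chat. The following is a minimal sketch, assuming the checkpoint loads through the standard `transformers` Auto classes and that `accelerate` is installed for `device_map="auto"`. The helper names `load_model` and `complete` are hypothetical and not part of this repository.

```python
# Minimal text-completion sketch for a foundation (non-chat) model.
# Assumptions: `transformers` and `accelerate` are installed, and there is
# enough GPU/CPU memory for an 8x7B mixture-of-experts checkpoint.

MODEL_ID = "hfl/chinese-mixtral"


def load_model(model_id=MODEL_ID):
    """Load the tokenizer and model (hypothetical helper, not from the repo)."""
    # Imported lazily so that defining these helpers does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint config
        device_map="auto",    # let accelerate place layers on available devices
    )
    return tokenizer, model


def complete(tokenizer, model, prompt, max_new_tokens=64):
    """Continue `prompt` with greedy decoding and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # The decoded string includes the prompt followed by its continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `complete(*load_model(), "中国的首都是")` would download the weights and return the prompt together with the model's continuation. No chat template is applied, in line with the note above.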

	## Others

	- For the LoRA-only model, see: https://huggingface.co/hfl/chinese-mixtral-lora

	- For the GGUF model (llama.cpp compatible), see: https://huggingface.co/hfl/chinese-mixtral-gguf

	- For questions or issues regarding this model, please open an issue at https://github.com/ymcui/Chinese-Mixtral/.

	## Citation

	Please consider citing our paper if you use resources from this repository.
	Paper link: https://arxiv.org/abs/2403.01851
	```bibtex
	@article{chinese-mixtral,
	title={Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral},
	author={Cui, Yiming and Yao, Xin},
	journal={arXiv preprint arXiv:2403.01851},
	url={https://arxiv.org/abs/2403.01851},
	year={2024}
	}
	```