---
license: apache-2.0
language:
- zh
- en
pipeline_tag: question-answering
---
# Chinese-Alpaca-Plus-13B-GPTQ

This is a GPTQ-format 4-bit quantised version of Yiming Cui's Chinese-LLaMA-Alpaca 13B, produced by quantising to 4 bits with GPTQ-for-LLaMa.
## Model Details

### Model Description
- Developed by: ymcui (Yiming Cui)
- Shared by: Known Rabbit
- Language(s) (NLP): Chinese, English
- License: Apache 2.0
- Finetuned from model: LLaMA
The original GitHub project: ymcui/Chinese-LLaMA-Alpaca (Chinese LLaMA & Alpaca LLMs, with local CPU/GPU deployment).

To promote open research on large models in the Chinese NLP community, this project open-sourced the Chinese LLaMA model and the instruction-tuned Chinese Alpaca model. These models extend the original LLaMA with an expanded Chinese vocabulary and use Chinese data for secondary pre-training, which improves basic Chinese semantic understanding. The Chinese Alpaca model is additionally fine-tuned on Chinese instruction data, which significantly improves its ability to understand and follow instructions. For details, please refer to the technical report (Cui, Yang, and Yao, 2023).
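The effect of the expanded vocabulary is easy to observe directly. Below is a minimal sketch that compares how the original and the expanded tokenizers segment the same Chinese sentence; the directory paths are the ones created in the Training Procedure section below, and the example sentence is an arbitrary choice:

```python
from transformers import LlamaTokenizer

# Paths assume the directories produced in the Training Procedure below.
base = LlamaTokenizer.from_pretrained("./llama-13b-hf")                  # original LLaMA vocab (32K tokens)
expanded = LlamaTokenizer.from_pretrained("./Chinese-Alpaca-Plus-13B")   # expanded Chinese vocab (~50K tokens)

text = "大语言模型正在改变自然语言处理。"

# The expanded tokenizer should produce noticeably fewer tokens for Chinese text,
# since many Chinese characters and words are single tokens in its vocabulary.
print("vocab sizes:", len(base), "vs", len(expanded))
print("token counts:", len(base.tokenize(text)), "vs", len(expanded.tokenize(text)))
```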
### Model Sources
- Repository: https://github.com/ymcui/Chinese-LLaMA-Alpaca
- Paper: [Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca](https://arxiv.org/abs/2304.08177)
## Uses

### Direct Use
#### How to easily download and use this model in text-generation-webui

Open the text-generation-webui UI as normal. (If you prefer to script the download, see the sketch after these steps.)

- Click the Model tab.
- Under Download custom model or LoRA, enter `rabitt/Chinese-Alpaca-Plus-13B-GPTQ`.
- Click Download.
- Wait until it says it's finished downloading.
- Click the Refresh icon next to Model in the top left.
- In the Model drop-down, choose the model you just downloaded: `Chinese-Alpaca-Plus-13B-GPTQ`.
- If you see an error like `Error no file named pytorch_model.bin ...` in the bottom right, ignore it - it's temporary.
- Fill out the GPTQ parameters on the right: `Bits = 4`, `Groupsize = 128`, `model_type = Llama`.
- Click Save settings for this model in the top right.
- Click Reload the Model in the top right.
- Once it says it's loaded, click the Text Generation tab and enter a prompt!
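As a scripted alternative to the UI download, a minimal sketch using the `huggingface_hub` library; the target directory name is an arbitrary choice:

```python
from huggingface_hub import snapshot_download

# Download all model files (quantised safetensors weights, tokenizer, configs)
# into a local directory.
snapshot_download(
    repo_id="rabitt/Chinese-Alpaca-Plus-13B-GPTQ",
    local_dir="Chinese-Alpaca-Plus-13B-GPTQ",
)
```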
## Training Details

### Training Procedure
Download the models from the following links:

- Original LLaMA: https://github.com/facebookresearch/llama/pull/73
- Chinese-LLaMA-Plus-13B
- Chinese-Alpaca-Plus-13B
Convert LLaMA to Hugging Face (HF) format with `convert_llama_weights_to_hf.py`:

```bash
wget https://github.com/huggingface/transformers/raw/main/src/transformers/models/llama/convert_llama_weights_to_hf.py
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python \
python convert_llama_weights_to_hf.py \
    --input_dir ./llama \
    --model_size 13B \
    --output_dir ./llama-13b-hf
```
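As a quick sanity check that the conversion succeeded, the output directory should load with plain `transformers`. A sketch; note that loading a 13B model in fp16 needs roughly 26 GB of memory:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("./llama-13b-hf")
model = LlamaForCausalLM.from_pretrained("./llama-13b-hf", torch_dtype=torch.float16)

# The original LLaMA tokenizer has 32000 tokens, and the parameter count should be ~13B.
print("vocab size:", len(tokenizer))
print("parameters:", sum(p.numel() for p in model.parameters()))
```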
Merge `Chinese-LLaMA-Plus-13B` and `Chinese-Alpaca-Plus-13B` into LLaMA with `merge_llama_with_chinese_lora.py`:

```bash
wget https://github.com/ymcui/Chinese-LLaMA-Alpaca/raw/main/scripts/merge_llama_with_chinese_lora.py
python merge_llama_with_chinese_lora.py \
    --base_model ./llama-13b-hf \
    --lora_model ./Chinese-LLaMA-Plus-LoRA-13B,./Chinese-Alpaca-Plus-LoRA-13B \
    --output_type huggingface \
    --output_dir ./Chinese-Alpaca-Plus-13B
```
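Before quantising, it can be worth smoke-testing the merged model. A minimal sketch; the instruction template below follows the Alpaca-style format the Chinese-Alpaca project is based on, but verify the exact template against the upstream repository:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("./Chinese-Alpaca-Plus-13B")
# device_map="auto" requires the `accelerate` package.
model = LlamaForCausalLM.from_pretrained(
    "./Chinese-Alpaca-Plus-13B", torch_dtype=torch.float16, device_map="auto"
)

# Alpaca-style instruction template (assumed; check the upstream repo for the exact format).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n请用一句话介绍中国的首都。\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```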
Quantise the model with GPTQ-for-LLaMa:

```bash
mkdir -p Chinese-Alpaca-Plus-13B-GPTQ
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git
cd GPTQ-for-LLaMa
# export CUDA_VISIBLE_DEVICES=0
python llama.py ../Chinese-Alpaca-Plus-13B c4 \
    --wbits 4 \
    --true-sequential \
    --act-order \
    --groupsize 128 \
    --save_safetensors ../Chinese-Alpaca-Plus-13B-GPTQ/Chinese-Alpaca-Plus-13B-GPTQ-4bit-128g.safetensors
```
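For inference, GPTQ-for-LLaMa ships its own loading scripts, but the resulting `.safetensors` checkpoint can also be loaded from Python with the AutoGPTQ library. A sketch under the following assumptions: AutoGPTQ's API has changed across versions, `model_basename` must match the safetensors filename without extension, and the quantisation config mirrors the flags used above (`--wbits 4 --groupsize 128 --act-order`):

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import LlamaTokenizer

quant_dir = "./Chinese-Alpaca-Plus-13B-GPTQ"

# No quantize_config.json is written by GPTQ-for-LLaMa, so pass one explicitly,
# matching the quantisation flags used above (desc_act corresponds to --act-order).
model = AutoGPTQForCausalLM.from_quantized(
    quant_dir,
    model_basename="Chinese-Alpaca-Plus-13B-GPTQ-4bit-128g",
    use_safetensors=True,
    quantize_config=BaseQuantizeConfig(bits=4, group_size=128, desc_act=True),
    device="cuda:0",
)

# The quantised directory above holds only the weights; load the tokenizer
# from the merged model directory (or copy its tokenizer files alongside the weights).
tokenizer = LlamaTokenizer.from_pretrained("./Chinese-Alpaca-Plus-13B")

inputs = tokenizer("中国的首都是", return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```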
## Citation

BibTeX:

```bibtex
@article{chinese-llama-alpaca,
  title={Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca},
  author={Cui, Yiming and Yang, Ziqing and Yao, Xin},
  journal={arXiv preprint arXiv:2304.08177},
  url={https://arxiv.org/abs/2304.08177},
  year={2023}
}
```