Question regarding LoRA params

#3
by zhehuderek - opened

Thanks for sharing the model! I have one question: in the original GitHub repo, I noticed they use LoRA to train the model, whereas you load the model directly with LlamaForCausalLM (without LoRA params). So I wonder: what's the difference between this model and the original one? Thank you!

The weights are the same. I merged the LoRA weights into the original model weights, allowing this model to be loaded with LlamaForCausalLM and fine-tuned directly.
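For reference, here is a minimal sketch of how such a merge can be done, assuming the adapter was trained with the peft library (the paths are placeholders, not the actual repos used here):

from transformers import LlamaForCausalLM
from peft import PeftModel

# Load the base model and attach the trained LoRA adapter
base_model = LlamaForCausalLM.from_pretrained("path/to/base-llama")
model = PeftModel.from_pretrained(base_model, "path/to/lora-adapter")

# merge_and_unload folds each LoRA update (W + (alpha/r) * B @ A) into the
# corresponding base weight matrix and strips the adapter wrappers
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")

# The saved checkpoint then contains only standard Llama parameter names,
# so it can be loaded directly:
# model = LlamaForCausalLM.from_pretrained("path/to/merged-model")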

Thanks for answering. I think if the LoRA weights exist, you usually need to load the model with code like:

from transformers import LlamaForCausalLM
from peft import PeftModel

base_model = LlamaForCausalLM.from_pretrained(xxx)
model = PeftModel.from_pretrained(base_model, xxx)  # say you use PEFT for the LoRA implementation

Thus I'm still confused about how you merged the LoRA weights and then loaded the model with LlamaForCausalLM, since the Hugging Face implementation of LlamaForCausalLM does not include any LoRA params, right? I would really appreciate your help on this.

Best

zhehuderek changed discussion status to closed
