Support for LoRA?

#3
by cekal - opened

Hi, does this model support LoRA training? I tried to fine-tune it with this (slightly modified) code: https://github.com/tloen/alpaca-lora/blob/main/finetune.py. Training started, but the learning rate showed 0.0 and the loss was an odd 40-50 (LLaMA models are usually under 2), and when the adapter was run alongside this base model after training finished, the model gave weird outputs.

If anyone has successfully fine-tuned this model with LoRA, please share how you did it. Any help is much appreciated! 👍🤝

If someone managed to train with QLoRA, please share your results

You can check the alpaca-lora code; it probably doesn't need too many changes:
https://github.com/tloen/alpaca-lora/blob/main/finetune.py

@zhangbo2008 as you can see in my initial message, I already tried that; I just modified the script a bit to use AutoModelForCausalLM and AutoTokenizer, etc. It doesn't seem to work correctly, something is wrong.
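For context, a rough sketch of that kind of modification (the model name, dtype, and padding handling here are illustrative, not the exact script used in the thread):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# In alpaca-lora's finetune.py, swap LlamaForCausalLM / LlamaTokenizer for the
# Auto classes so Falcon's custom modeling code (modeling_RW.py) is loaded
# via trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    torch_dtype=torch.bfloat16,   # illustrative; fp16 or 8-bit loading also possible
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b", trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # Falcon's tokenizer has no pad token by default
```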

OK, I will try.

Alright. You'll probably get training running, but if the training loss is that high and the learning rate shows 0.0, chances are the model is training incorrectly (and the final adapter will be useless). This might be because modeling_RW.py is not set up (configured) to support LoRA.

If it works for you, please share your version of the code. Thanks!
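A quick sanity check for anyone trying this (a minimal sketch with assumed LoRA hyperparameters): after wrapping the model with PEFT, confirm that some parameters are actually trainable; if the trainable count is zero, the target module names don't match Falcon's layers and the adapter will learn nothing.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", trust_remote_code=True, device_map="auto"
)
config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,  # assumed values, not from the thread
    bias="none", task_type="CAUSAL_LM",
    target_modules=["query_key_value"],      # Falcon's fused attention projection
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # should report a small but non-zero trainable fraction
```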

@zhangbo2008 I found this: https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt-neox-20b_peft/clm_finetune_peft_imdb.py

Maybe slightly modifying this code would do the trick? I'm not home so I can't run the training, but this could possibly work.

@cekal @xiao111 - I can confirm that the modification mentioned in the notebook actually works. I was able to fine-tune / further train falcon-7b with an instruction-following strategy. Keep in mind that after training you need to merge the new weights back into the original model files in order to be able to load it with trust_remote_code set to True.
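A rough sketch of that merge step, assuming a PEFT LoRA adapter saved to ./falcon-7b-lora (the paths are illustrative):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in half precision (merging does not work on 8-bit/4-bit weights).
base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", torch_dtype=torch.bfloat16, trust_remote_code=True
)
model = PeftModel.from_pretrained(base, "./falcon-7b-lora")  # adapter dir from training
model = model.merge_and_unload()                             # fold the LoRA deltas into the base weights

model.save_pretrained("./falcon-7b-merged")
AutoTokenizer.from_pretrained(
    "tiiuae/falcon-7b", trust_remote_code=True
).save_pretrained("./falcon-7b-merged")
```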

FalconLLM pinned discussion

Thanks for sharing. Unfortunately my Colab only has a 15 GB GPU, which doesn't work for 4-bit int mode.

I have tried fine-tuning falcon 7b with qlora using axolotl, and it seems to work: https://github.com/OpenAccess-AI-Collective/axolotl/pull/132

If you encounter any issue with the config or spot any problems in the config, please ping me in the PR. Thanks!

@cekal how did you finally get it to work? I also tried modifying the Alpaca-LoRA code, switching to AutoTokenizer & AutoModelForCausalLM.
I also changed lora_target_module to ["query_key_value"].

I get the error ValueError: The length of enable_lora must divide out_features

EDIT - Fixed by updating the packages.

Hi @utensil, I have compared qlora.yml and lora.yml for Falcon 7B. The main difference seems to be only these fields:

load_in_8bit: true
load_in_4bit: false
optimizer: paged_adamw_32bit

Is there any other difference?

Technology Innovation Institute org

If you are interested in finetuning the models, we would recommend having a look at FalconTune (which supports finetuning in 4-bit) or at this blogpost from HF, specifically at the section on finetuning the model with PEFT.
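For the PEFT route from the blogpost, a minimal QLoRA-style training sketch (the dataset, hyperparameters, and output paths are placeholders, not taken from the blogpost):

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "tiiuae/falcon-7b"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, trust_remote_code=True, device_map="auto"
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, bias="none",
    task_type="CAUSAL_LM", target_modules=["query_key_value"],
))

# Placeholder dataset: replace with your instruction-tuning data.
data = load_dataset("imdb", split="train[:1%]")
data = data.map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=512),
    batched=True, remove_columns=data.column_names,
)

trainer = Trainer(
    model=model,
    train_dataset=data,
    args=TrainingArguments(
        output_dir="falcon-7b-qlora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=1,
        bf16=True,
        optim="paged_adamw_32bit",  # the paged optimizer mentioned in the configs above
        logging_steps=10,
    ),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("falcon-7b-qlora")  # saves the LoRA adapter
```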

Hi sumegh, which package are you referring to? I get the same error and don't know how to fix it.

I updated my CUDA version to 11.8 and reinstalled all packages following the Jupyter notebook as-is. It worked.

@FalconLLM
Is there any literature published on the internal architecture of the decoder blocks and how they are organized? Are there any plans for a publication anytime soon?
I would like to experiment with touching only certain submodules (instead of all of them) with LoRA/QLoRA adapters during fine-tuning, to get some understanding of how the self-attention block and the MLP block across the various decoder layers contribute to overall model performance.
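If it helps, restricting the adapters to particular submodules can be done through the PEFT config alone; a hedged sketch (the module names follow Falcon-7B's modeling_RW.py naming, and the ranks and layer indices are illustrative):

```python
from peft import LoraConfig

# Attention-only adapters: the fused QKV projection ("query_key_value")
# and the attention output projection ("dense") in each decoder block.
attn_only = LoraConfig(
    r=16, lora_alpha=32, task_type="CAUSAL_LM",
    target_modules=["query_key_value", "dense"],
)

# MLP-only adapters: the two feed-forward projections in each block.
mlp_only = LoraConfig(
    r=16, lora_alpha=32, task_type="CAUSAL_LM",
    target_modules=["dense_h_to_4h", "dense_4h_to_h"],
)

# Optionally restrict the adapters to a subset of decoder layers as well.
late_attn_only = LoraConfig(
    r=16, lora_alpha=32, task_type="CAUSAL_LM",
    target_modules=["query_key_value"],
    layers_to_transform=list(range(24, 32)),  # Falcon-7B has 32 decoder blocks
)
```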
