Training
#2 · opened by freegheist
Any chance for hyperparameters or training config? :)
Here you go my man: https://gist.github.com/mtisz/5cd0e72844e552fd06e77535c81bbfae
This was for a 4xA100 machine. Play around with the following (a sketch of the relevant config fields follows the list):
- learning_rate
- lora_r (rank of the LoRA adapters)
- gradient_accumulation_steps
- micro_batch_size
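For reference, here's a minimal sketch of how those knobs look in an Axolotl-style YAML config. The values are placeholders for illustration, not the ones in the gist above:

```yaml
# Illustrative Axolotl QLoRA settings -- placeholder values, tune for your setup
adapter: qlora
learning_rate: 0.0002            # lower this if the loss spikes or diverges
lora_r: 64                       # LoRA rank; higher = more trainable parameters
lora_alpha: 16
micro_batch_size: 4              # per-GPU batch size
gradient_accumulation_steps: 8   # effective batch = micro_batch_size * grad_accum * num_gpus
```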
Make sure to comment out the fsdp and fsdp_config sections when you're ready to merge the QLoRA adapter; there's a bug in Axolotl that makes the model merge hang otherwise.
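For example, before merging, the FSDP block in the config would look something like this (a sketch assuming the usual Axolotl FSDP keys; your actual values may differ):

```yaml
# Comment out FSDP before merging the QLoRA adapter, otherwise the merge can hang
# fsdp:
#   - full_shard
#   - auto_wrap
# fsdp_config:
#   fsdp_offload_params: true
#   fsdp_state_dict_type: FULL_STATE_DICT
#   fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer

# Then merge with something like (path is illustrative):
# python3 -m axolotl.cli.merge_lora config.yml --lora_model_dir="./output"
```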
migtissera changed discussion status to closed