Just a couple of confusions to clear up

#3
opened by Talha

Hi, first of all, thanks. I have a few points of confusion; could you please clear them up?

  1. Did you use the Jupyter notebook or the script?
  2. How did you merge the adapter? Can you please share the script?
  3. How much GPU memory and system memory are required to fine-tune the model, and how much time did it take?

Thanks
  1. I used the script. The notebook is only for inference, as far as I can tell.
  2. I used a script TheBloke posted quite a while ago. Here is a link to the comment where he shared the script (a rough sketch of the merge step is also shown after this list).
  3. For the 7b model it only required around 8GB of VRAM and 5GB of RAM (the second sketch below shows the 4-bit loading that keeps the footprint that low). The time will depend very heavily on what GPU you use. I ran it on a rented 4090 and it took around 2 hours. I also trained the 13b model on the 4090; I can't recall the exact amount of VRAM it required, but it took around 6 hours to run. For the 70b model I used an A100, mostly for the speed advantage it gave, though even then it took around 38 hours to run.
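
For anyone who can't follow the link, here is a minimal sketch of what the merge step generally looks like with the Hugging Face peft library. This is not TheBloke's exact script, and the model/adapter paths are placeholders:

```python
# Minimal sketch: fold a trained (Q)LoRA adapter back into the base model.
# Paths and model names below are placeholders, not the ones from the thread.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_path = "huggyllama/llama-7b"            # placeholder base model
adapter_path = "./qlora-output/checkpoint-final"   # placeholder adapter dir
output_path = "./merged-model"

# Load the base model unquantized (merging needs full/half-precision weights).
base = AutoModelForCausalLM.from_pretrained(
    base_model_path, torch_dtype=torch.float16, device_map="cpu"
)

# Wrap the base model with the trained adapter, then fold the LoRA deltas
# into the base weights and drop the adapter modules.
model = PeftModel.from_pretrained(base, adapter_path)
merged = model.merge_and_unload()

# Save the standalone merged model along with the tokenizer.
merged.save_pretrained(output_path)
AutoTokenizer.from_pretrained(base_model_path).save_pretrained(output_path)
```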
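
On the memory question: the low VRAM numbers come from QLoRA loading the base weights in 4-bit NF4 and only training the small adapter on top. A minimal sketch of that loading step, assuming the standard transformers/bitsandbytes integration (the model name is again a placeholder):

```python
# Sketch: load a 7b model in 4-bit NF4 as in the QLoRA paper, which is
# why fine-tuning can fit in roughly 8GB of VRAM.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",  # placeholder 7b base model
    quantization_config=bnb_config,
    device_map="auto",
)
```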

Thanks a lot, I will try and see if I can produce something.

Good luck! Though if you are training your own model, I'd recommend looking into more polished training tools like Axolotl (which supports resuming training) or H2O LLM Studio (which has a relatively easy-to-use GUI). The main reason I used the QLoRA script directly was that I wanted to follow the original Guanaco training as closely as possible.

Thanks, I will look into both; I was not aware of either.
