Finetuning Scripts

#5 opened by abrakaa

Does anyone happen to have the fine-tuning scripts for this model? What is the minimum GPU requirement for fine-tuning?

@abrakaa Thanks for your interest in our model. We may open-source the finetuning script in the near future.

@donniems We need it soon :)

@donniems looking forward to it

@donniems Need it now! Really looking forward to it!

@donniems Can’t wait to finetune on my own data!


I wrote a finetuning script for Phi3-V: https://github.com/GaiZhenbiao/Phi3V-Finetuning/tree/main, enjoy 😉

https://github.com/2U1/Phi3-Vision-ft

I've written code with an option to fine-tune all modules (including the vision model), like LLaVA-1.6!

Microsoft org

Thank you all for your interest in the Phi-3 Vision model.
This is the finetuning recipe: https://github.com/microsoft/Phi-3CookBook/blob/main/md/04.Fine-tuning/FineTuning_Vision.md
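
For a quick sense of the input format before diving into the recipe, here is a minimal sketch of preparing a single training example with the processor, following the usage shown on the model card. The image path, question, and answer are hypothetical placeholders; the actual recipe handles batching and label masking for you.

```python
# Minimal sketch: preparing one training example for Phi-3-Vision.
# The cookbook recipe above handles batching, label masking, etc.
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained(
    "microsoft/Phi-3-vision-128k-instruct", trust_remote_code=True
)

image = Image.open("example_doc.png")  # hypothetical document image
messages = [
    {"role": "user", "content": "<|image_1|>\nWhat is the invoice total?"},
    {"role": "assistant", "content": "$1,234.56"},
]
# For training we keep the assistant turn, so no generation prompt is appended.
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=False
)
inputs = processor(prompt, [image], return_tensors="pt")
# inputs now holds input_ids, attention_mask, pixel_values, and image_sizes.
```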

@nguyenbh Hi, thank you so much for the finetuning example! The example states that the average token count for the DocVQA dataset is about 2XXX. If there is a longer input, would that cause an OOM error? Does the script have any protection against that?

@eddtsoi There's no explicit code for handling OOM. Note that the ~2k average token count includes the image tokens. For cases that don't require high resolution, it is possible to finetune with a lower --num_crops to reduce the sequence length. For full finetuning, we have tested on 4x 48GB and 8x 32GB GPUs. If using (q)lora, I believe a single consumer-level 24GB GPU works.
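
To illustrate the low-memory (q)lora path mentioned above, here is a minimal sketch using peft and bitsandbytes. The target_modules names are an assumption based on Phi-3's fused projection naming (qkv_proj, gate_up_proj); check the official recipe for the exact configuration and hyperparameters.

```python
# Minimal QLoRA sketch for the single-24GB-GPU path described above.
# Assumes transformers, peft, and bitsandbytes are installed; the
# target_modules list is an assumption and may need adjusting against
# the official recipe.
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,  # 4-bit base weights to fit consumer GPUs
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-vision-128k-instruct",
    trust_remote_code=True,
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trained
```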
