Why might an instruct model not be ideal for further fine-tuning?

#8
by LucienShui - opened

Is there any reference?

Technology Innovation Institute org
edited May 31, 2023

Unfortunately, I don't have a direct reference off the top of my head. But let's consider three different methods for fine-tuning:

a: Finetune(new_data) 
b: Finetune(instructions and new_data) 
c: Finetune(instructions then new_data)  <--- starting from instruct model

I would expect either a or b to yield the best results, depending on the task, although there is no guarantee. So starting from the base model gives you the most flexibility to try all of the different approaches and see what works best for you.
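To make the flexibility argument concrete, here is a minimal sketch that only models which of the three options stay available from each starting checkpoint. It does no training. The labels `"base"` and `"instruct"`, the dataset names, and the helpers `reachable` and `remaining` are all hypothetical, and it assumes the instruct model is simply the base model after one instruction-tuning run.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One stage = the datasets mixed together in a single fine-tuning run.
Stage = Tuple[str, ...]


@dataclass(frozen=True)
class Plan:
    name: str
    stages: Tuple[Stage, ...]  # full sequence of runs, counted from the base model


# The three options from the reply, written as sequences of runs.
PLANS = [
    Plan("a", (("new_data",),)),                    # Finetune(new_data)
    Plan("b", (("instructions", "new_data"),)),     # Finetune(instructions and new_data)
    Plan("c", (("instructions",), ("new_data",))),  # Finetune(instructions then new_data)
]

# Assumption: runs already baked into each (hypothetical) starting checkpoint.
ALREADY_DONE = {
    "base": (),
    "instruct": (("instructions",),),
}


def reachable(start: str) -> List[str]:
    """Names of the plans whose first runs match what the checkpoint already had."""
    done = ALREADY_DONE[start]
    return [p.name for p in PLANS if p.stages[: len(done)] == done]


def remaining(start: str, plan: Plan) -> Tuple[Stage, ...]:
    """Runs still to be executed for a plan that is reachable from `start`."""
    return plan.stages[len(ALREADY_DONE[start]):]


print(reachable("base"))      # ['a', 'b', 'c']
print(reachable("instruct"))  # ['c']
```

Under these assumptions, the instruct checkpoint only leaves option c open, while the base checkpoint leaves all three. Reaching c from the base model does require having instruction data of your own to run that first stage.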

I originally thought that less data meant fewer computing resources. That's a valuable perspective too. Thanks a lot. :)

LucienShui changed discussion status to closed
