tiiuae/falcon-40b-instruct · Why might an instruct model not be ideal for further fine-tuning?

LucienShui

May 31, 2023

Is there any reference?

FalconLLM

Technology Innovation Institute org May 31, 2023

edited May 31, 2023

Unfortunately, I don't have a direct reference off the top of my head. But let's consider three different methods for fine-tuning:

a: Finetune(new_data) 
b: Finetune(instructions and new_data) 
c: Finetune(instructions then new_data)  <--- starting from instruct model

I would expect either a or b to yield the best results, depending on the task, although there is no guarantee. Starting from the base model therefore gives you the most flexibility to try all three approaches and see what works best for you, whereas starting from the instruct model limits you to c.
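A minimal sketch of that reasoning in Python, assuming hypothetical names throughout (`Stage`, `RECIPES`, `reachable_recipes`, `remaining_stages`, and the dataset labels are illustrative only and not part of any Falcon or Hugging Face API). It only models which recipes remain available from each starting checkpoint; it does no training.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One fine-tuning pass over a named mix of datasets (hypothetical)."""
    datasets: tuple


# The three recipes from the post, written as ordered lists of stages.
RECIPES = {
    "a": [Stage(("new_data",))],
    "b": [Stage(("instructions", "new_data"))],
    "c": [Stage(("instructions",)), Stage(("new_data",))],
}


def reachable_recipes(checkpoint: str) -> list:
    """Return the recipe labels that can still be run from a checkpoint.

    Assumption: an "instruct" checkpoint has already completed the
    instructions-only stage, so only recipe c can be finished from it.
    """
    if checkpoint == "base":
        return sorted(RECIPES)
    if checkpoint == "instruct":
        return ["c"]
    raise ValueError(f"unknown checkpoint: {checkpoint!r}")


def remaining_stages(checkpoint: str, recipe: str) -> list:
    """Return the stages still to run for a recipe from a checkpoint."""
    if recipe not in reachable_recipes(checkpoint):
        raise ValueError(f"recipe {recipe!r} is not reachable from {checkpoint!r}")
    stages = list(RECIPES[recipe])
    # The instruct checkpoint has already done the first stage of recipe c.
    return stages[1:] if checkpoint == "instruct" else stages


if __name__ == "__main__":
    for checkpoint in ("base", "instruct"):
        print(checkpoint, reachable_recipes(checkpoint))
```

From the base checkpoint all three recipes are reachable, while from the instruct checkpoint only the second stage of c is left to run. Which recipe actually performs best remains an empirical question.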

LucienShui

May 31, 2023

I originally thought that less data meant fewer computing resources. Your point is a valuable perspective too. Thanks a lot. :)

LucienShui changed discussion status to closed May 31, 2023