Whether the merged model is fine-tuned

#1
by syGOAT - opened

Congratulations on the breakthrough you've achieved with model merging! I'd like to ask you some questions. Did you fine-tune the model after merging it, or was it not fine-tuned? If you did fine-tune it, did you consider the difference between fine-tuning after merging and fine-tuning the base model directly?

It says

finetuned with argilla/distilabel-intel-orca-dpo-pairs

Hey @syGOAT. The model was finetuned after the merge, since it's based on Labonne's merged model.

As for comparisons with the actual base model, I didn't do that, since the merged model consisted of two models, one of which was itself a merge of several models.

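For readers unfamiliar with the merge-then-finetune workflow discussed here: with a tool like mergekit, the merge step is declared in a YAML config, and the resulting checkpoint is then preference-tuned (e.g. DPO on argilla/distilabel-intel-orca-dpo-pairs, as the model card states). The sketch below uses placeholder model names, not the actual recipe behind this model:

```yaml
# Hypothetical mergekit SLERP config (model names are placeholders).
slices:
  - sources:
      - model: org/model-A        # placeholder source model
        layer_range: [0, 32]
      - model: org/model-B        # placeholder source model
        layer_range: [0, 32]
merge_method: slerp
base_model: org/model-A
parameters:
  t:
    - value: 0.5                  # interpolation weight between the two models
dtype: bfloat16
```

After running the merge, the produced checkpoint would be loaded like any base model and finetuned on the DPO pairs dataset; that ordering (merge first, then finetune) is what's described in this thread.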

Thanks for your answer! Fine-tuning after merging, merging after fine-tuning... The future of LLM is becoming more diverse and colorful.
