Finetuning Base Falcon on Unseen Language/New data (non instruct/RLHF)

#91
by AshBam - opened

I understand that the Falcon model is not meant to work on unseen languages (listed as a limitation). However, I need to do so. An instruct-only fine-tuning is giving pretty unstable results at the moment. I've scoured the internet for a resource to help with this, but haven't been able to find one.

Does anyone have any idea how it can be made possible? I've been going through the PEFT, LoRA, and DeepSpeed libraries related to Falcon to try to reverse engineer the process: how to add new layers on top of the frozen layers, and whether it might be possible to unfreeze and tune other layers as well. However, I've not been able to find something workable.
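For concreteness, the direction I have in mind looks roughly like this (just a sketch, assuming the tiiuae/falcon-7b checkpoint and the Hugging Face transformers/peft APIs; `query_key_value` and `word_embeddings` are the module names I believe Falcon uses):

```python
# Sketch: LoRA adapters on the frozen base model, while the token embeddings
# and LM head stay fully trainable so new-language vocabulary can be learned.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,   # memory-saving choice, not required
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # Falcon's fused attention projection; everything else in the base stays frozen.
    target_modules=["query_key_value"],
    # Fully train (not just adapt) the embeddings and output head.
    modules_to_save=["word_embeddings", "lm_head"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```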

Please help me out if there are any resources for this.

Just because no one has tried it doesn't mean it won't work.
Personally, I'd try careful fine-tuning of the embeddings using dictionaries of that particular language combined with all the languages Falcon already knows well, so it can connect the new words to existing words.
Then I'd do the same on sentences, keeping a large corpus of held-out examples to regularly test the progress.
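A minimal sketch of that first, embedding-only stage (assuming the tiiuae/falcon-7b checkpoint; the dictionary lines and their format are placeholders, not a recipe):

```python
# Sketch: freeze the whole model except the input embeddings and train on
# short "dictionary" lines pairing new-language words with known languages.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Freeze everything, then unfreeze only the token embeddings.
for param in model.parameters():
    param.requires_grad = False
embeddings = model.get_input_embeddings()
embeddings.weight.requires_grad = True

optimizer = torch.optim.AdamW([embeddings.weight], lr=1e-4)

# Hypothetical dictionary-style training lines, e.g. "<new word> means <english word>".
pairs = ["xamle means apple", "toruk means mountain"]  # placeholder data

model.train()
for line in pairs:
    batch = tokenizer(line, return_tensors="pt")
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The second stage would swap the dictionary lines for full sentences in the new language, with a held-out set for regular evaluation.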

Thanks, will try something out.
