Fine-tuning and multilingual capability

#5
by adrien-alloreview - opened

First, I find this embedding model very interesting.
I've always been frustrated that it was not possible to "explain" to an embedding model what the embedding is for. Thanks to your work, it now is.
Do you plan to make this model multilingual? And how could it be fine-tuned, both to become multilingual and to handle more specific tasks?

Thanks in advance

NLP Group of The University of Hong Kong org

Thank you very much for your interest!

We are considering making this model multilingual. Fine-tuning the model on more specific tasks is straightforward: prepare your data in the format described at https://github.com/HKUNLP/instructor-embedding#training, store it as a JSON file named medi-data.json, then follow the training instructions in the README at https://github.com/HKUNLP/instructor-embedding#train-instructor to train the model.
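For the data-preparation step, here is a minimal sketch of writing such a file. It assumes each training example is a dictionary holding a query, a positive and a negative, each given as an [instruction, text] pair, plus an integer task id. The key names (`query`, `pos`, `neg`, `task_id`) are taken from a reading of the README and should be checked against it before training. The helper names `build_example` and `write_medi_data` and the French sample texts are made up for illustration only.

```python
import json
from pathlib import Path


def build_example(query_instruction, query, pos_instruction, pos,
                  neg_instruction, neg, task_id):
    """Build one training example.

    Assumed layout (verify against the README): each of query/pos/neg is an
    [instruction, text] pair, and task_id groups examples of the same task.
    """
    return {
        "query": [query_instruction, query],
        "pos": [pos_instruction, pos],
        "neg": [neg_instruction, neg],
        "task_id": task_id,
    }


def write_medi_data(examples, path="medi-data.json"):
    """Write the list of examples as one JSON array and return the path."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-English text readable in the file.
        json.dump(examples, f, ensure_ascii=False, indent=2)
    return path


# Illustrative, made-up data for a hypothetical French retrieval task.
examples = [
    build_example(
        "Represent the French question for retrieving supporting documents:",
        "Quelle est la capitale de la France ?",
        "Represent the French document for retrieval:",
        "Paris est la capitale de la France.",
        "Represent the French document for retrieval:",
        "Lyon est une ville du sud-est de la France.",
        task_id=0,
    ),
]

if __name__ == "__main__":
    write_medi_data(examples)
```

Running the script writes medi-data.json to the current directory. Where the training script expects to find the file is described in the README.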

If you run into any problems, feel free to leave your question here or contact me at hjsu@cs.hku.hk!
