Model Description
This model is based on Sentence-Transformers' distiluse-base-multilingual-cased
model, which has been extended to produce sentence embeddings for Estonian.
Sentence-Transformers
This model can be imported directly via the SentenceTransformers package as shown below:
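from sentence_transformers import SentenceTransformer
# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer('kiri-ai/distiluse-base-multilingual-cased-et')
# Sample input; Estonian sentences can be passed in the same way
sentences = ['Here is a sample sentence', 'Another sample sentence']
# encode() returns one embedding vector per input sentence
embeddings = model.encode(sentences)
print("Sentence embeddings:")
print(embeddings)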
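Because the model is meant to place Estonian sentences in the same embedding space as the base model's other languages, one common way to use it is to compare a sentence with its translation. The sketch below is illustrative only: the helper names `cosine_similarity` and `compare_translation` are hypothetical, and the example sentences in the usage note are not taken from the training data. The import is placed inside the function so the similarity helper can be used without loading the model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (lists or NumPy rows)."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Treat a zero vector as having no similarity to anything
        return 0.0
    return dot / (norm_u * norm_v)


def compare_translation(english, estonian):
    """Embed an English sentence and an Estonian sentence and compare them.

    Hypothetical helper: downloads the model on first use.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer('kiri-ai/distiluse-base-multilingual-cased-et')
    en_vec, et_vec = model.encode([english, estonian])
    return cosine_similarity(en_vec, et_vec)
```

For example, `compare_translation('Hello, how are you?', 'Tere, kuidas sul läheb?')` returns a single similarity score; a value close to 1 would suggest the two sentences are embedded near each other.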
Fine-tuning
The fine-tuning and training processes were inspired by the multilingual training techniques described in the Sentence-Transformers (SBERT) documentation, which explains step by step how to use parallel sentences to train a model for a new language.
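The SBERT multilingual approach is usually described as knowledge distillation: a student model is trained so that both an English sentence and its translation map to the embedding a teacher model produces for the English sentence. The sketch below shows that objective on plain Python lists. It is an assumption that this model was trained with exactly this loss; the function names and the toy vectors are hypothetical and serve only to illustrate the idea.

```python
def mean_squared_error(u, v):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def distillation_loss(teacher_src, student_src, student_tgt):
    """Loss for one parallel sentence pair.

    teacher_src: teacher embedding of the English (source) sentence
    student_src: student embedding of the same English sentence
    student_tgt: student embedding of the Estonian (target) translation
    Both student embeddings are pulled toward the teacher's source embedding.
    """
    return (mean_squared_error(student_src, teacher_src)
            + mean_squared_error(student_tgt, teacher_src))


# Toy two-dimensional embeddings, not real model output
loss = distillation_loss(
    teacher_src=[1.0, 0.0],
    student_src=[1.0, 0.0],   # already matches the teacher
    student_tgt=[0.0, 0.0],   # translation embedding still off
)
print(loss)
```

In practice the embeddings would come from the teacher and student models over batches of parallel sentences, and the loss would be minimized with a gradient-based optimizer.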
Resources
The model was fine-tuned on English-Estonian parallel sentences taken from OPUS and ParaCrawl.