A newer version of this model is available:
syubraj/romaneng2nep_v3
Model Card for syubraj/RomanEng2Nep-v2
Because of compute constraints, the model was trained in multiple stages:
- Trained for 8500 steps on [0% : 5%] of the dataset.
- Continued from step 8500 up to step 16500 on [5% : 20%] of the dataset.
- Continued from step 16500 up to step 22000 on [20% : 40%] of the dataset.
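The staged schedule above can be written down with the percent-slicing syntax that the `datasets` library accepts in split names. The sketch below only builds the split expressions and the number of steps run in each stage. It assumes the data sits in a split named `train` and that the stages were cut this way, neither of which this card confirms; `STAGES`, `split_expr`, and `stage_plan` are hypothetical names.

```python
# Each stage: (start %, end %, cumulative step count at the end of the stage),
# taken from the list above.
STAGES = [(0, 5, 8500), (5, 20, 16500), (20, 40, 22000)]


def split_expr(start_pct, end_pct, split="train"):
    """Build a percent-slice split string; the split name "train" is an assumption."""
    return f"{split}[{start_pct}%:{end_pct}%]"


def stage_plan(stages):
    """Return (split expression, steps run within that stage) for every stage."""
    plan = []
    prev_steps = 0
    for start_pct, end_pct, cumulative_steps in stages:
        plan.append((split_expr(start_pct, end_pct), cumulative_steps - prev_steps))
        prev_steps = cumulative_steps
    return plan


for expr, steps in stage_plan(STAGES):
    print(f"{expr}: {steps} steps")
```

Each expression could then be passed as the `split` argument of `datasets.load_dataset` for the training dataset, provided the split name matches.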
Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was generated automatically.
- Model type: Translation
- Language(s) (NLP): Nepali, English
- License: Apache License 2.0
- Finetuned from model: google/mt5-small
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/RomanEng2Nep-v2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Set max sequence length
max_seq_len = 20

def translate(text):
    # Tokenize the input text, truncating it to max_seq_len tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)
    # Generate translation
    translated = model.generate(**inputs)
    # Decode the translated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "muskuraudai"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
Training Data
syubraj/roman2nepali-transliteration
Training Hyperparameters
- Training regime:
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/content/drive/MyDrive/romaneng2nep_v2/",
    eval_strategy="steps",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=2,
    predict_with_generate=True,
)
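As a rough sanity check on these settings, the number of training examples consumed in each stage follows from the per-device batch size and the step counts of the three training stages. This is a sketch under assumptions the card does not state: a single device and no gradient accumulation. `examples_seen` and `STAGE_STEPS` are hypothetical names, and the counts include any repeats across epochs.

```python
PER_DEVICE_TRAIN_BATCH_SIZE = 16  # from the training arguments above

# Steps run within each stage: 8500, then 16500 - 8500, then 22000 - 16500.
STAGE_STEPS = {"[0% : 5%]": 8500, "[5% : 20%]": 8000, "[20% : 40%]": 5500}


def examples_seen(steps, batch_size=PER_DEVICE_TRAIN_BATCH_SIZE,
                  num_devices=1, grad_accum_steps=1):
    """Examples processed in `steps` optimizer steps.

    num_devices=1 and grad_accum_steps=1 are assumptions, not values from the card.
    """
    return steps * batch_size * num_devices * grad_accum_steps


for stage, steps in STAGE_STEPS.items():
    print(f"{stage}: {examples_seen(steps)} examples")
```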
Training and Validation Metrics
Step | Training Loss | Validation Loss | Gen Len |
---|---|---|---|
500 | 21.636200 | 9.776628 | 2.001900 |
1000 | 10.103400 | 6.105016 | 2.077900 |
1500 | 6.830800 | 5.081259 | 3.811600 |
2000 | 6.003100 | 4.702793 | 4.237300 |
2500 | 5.690200 | 4.469123 | 4.700000 |
3000 | 5.443100 | 4.274406 | 4.808300 |
3500 | 5.265300 | 4.121417 | 4.749400 |
4000 | 5.128500 | 3.989708 | 4.782300 |
4500 | 5.007200 | 3.885391 | 4.805100 |
5000 | 4.909600 | 3.787640 | 4.874800 |
5500 | 4.836000 | 3.715750 | 4.855500 |
6000 | 4.733000 | 3.640963 | 4.962000 |
6500 | 4.673500 | 3.587330 | 5.011600 |
7000 | 4.623800 | 3.531883 | 5.068300 |
7500 | 4.567400 | 3.481622 | 5.108500 |
8000 | 4.523200 | 3.445404 | 5.092700 |
8500 | 4.464000 | 3.413630 | 5.132700 |
9000 | 4.423100 | 3.326201 | 5.211700 |
9500 | 4.315700 | 3.238422 | 5.200600 |
10000 | 4.218200 | 3.143774 | 5.288100 |
10500 | 4.133600 | 3.080613 | 5.202300 |
11000 | 4.087700 | 3.011713 | 5.271800 |
11500 | 4.004300 | 2.957386 | 5.178700 |
12000 | 3.956700 | 2.898953 | 5.209600 |
12500 | 3.922800 | 2.850440 | 5.210100 |
13000 | 3.853400 | 2.796974 | 5.171700 |
13500 | 3.807900 | 2.745325 | 5.281200 |
14000 | 3.755700 | 2.708517 | 5.223000 |
14500 | 3.729300 | 2.678200 | 5.210700 |
15000 | 3.673600 | 2.637842 | 5.230200 |
15500 | 3.625400 | 2.607649 | 5.264100 |
16000 | 3.601100 | 2.592188 | 5.129800 |
16500 | 3.608200 | 2.556329 | 5.215800 |
17000 | 3.557900 | 2.536781 | 5.162900 |
17500 | 3.533500 | 2.504695 | 5.206000 |
18000 | 3.500000 | 2.477887 | 5.211600 |
18500 | 3.463600 | 2.456758 | 5.201000 |
19000 | 3.457100 | 2.433362 | 5.210000 |
19500 | 3.435400 | 2.411479 | 5.197600 |
20000 | 3.413300 | 2.392534 | 5.221100 |
20500 | 3.366100 | 2.378421 | 5.165200 |
21000 | 3.363500 | 2.357117 | 5.187300 |
21500 | 3.346500 | 2.343485 | 5.193600 |
22000 | 3.328300 | 2.331021 | 5.183300 |
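To see how much each training stage contributed, the validation loss at the stage boundaries (steps 8500, 16500, and 22000) can be compared directly. A minimal sketch: the values are copied from the table above, step 500 stands in for the start of training because it is the first logged evaluation, and `VAL_LOSS_AT` and `stage_drops` are hypothetical names.

```python
# Validation loss at the first logged step and at each stage boundary,
# copied from the table above.
VAL_LOSS_AT = {
    500: 9.776628,
    8500: 3.413630,
    16500: 2.556329,
    22000: 2.331021,
}


def stage_drops(losses):
    """Return (start step, end step, loss decrease) for consecutive checkpoints."""
    steps = sorted(losses)
    return [(a, b, losses[a] - losses[b]) for a, b in zip(steps, steps[1:])]


for start, end, drop in stage_drops(VAL_LOSS_AT):
    print(f"steps {start} to {end}: validation loss fell by {drop:.6f}")
```

The decreases shrink from stage to stage, which matches the flattening visible in the last rows of the table.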