romaneng2nep_v2

This model is a fine-tuned version of google/mt5-small, trained on the syubraj/roman2nepali-transliteration dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0320
  • Gen Len: 5.1538

Model Usage

# Install the dependency first (in a notebook cell: !pip install transformers)
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/romaneng2nep_v3"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Set max sequence length
max_seq_len = 20

def translate(text):
    # Tokenize the input text with a max length of 20
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)

    # Generate translation
    translated = model.generate(**inputs)

    # Decode the translated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "muskuraudai"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
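To process several inputs at once, the single-string function above can be extended to a batch. The following is a minimal sketch, not part of the original card: `transliterate_batch` is a hypothetical helper name, and using `max_new_tokens=max_seq_len` as the output cap is an assumption rather than a setting documented for this model. It expects the `tokenizer` and `model` objects loaded above.

```python
from typing import List


def transliterate_batch(texts: List[str], tokenizer, model, max_seq_len: int = 20) -> List[str]:
    """Transliterate several Romanized Nepali strings with one generate() call.

    `tokenizer` and `model` are the objects loaded from the checkpoint above.
    """
    if not texts:
        return []

    # padding=True pads shorter inputs so the batch forms a rectangular tensor
    inputs = tokenizer(
        texts,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=max_seq_len,
    )

    # Assumption: cap the output length at the same value as the input length
    outputs = model.generate(**inputs, max_new_tokens=max_seq_len)

    # Decode every generated sequence back to text
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the real checkpoint loaded, a call would look like `transliterate_batch(["muskuraudai"], tokenizer, model)`.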

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
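To illustrate what `lr_scheduler_type: linear` means for the learning rate listed above, here is a small sketch of a linear warmup-then-decay schedule. It is an illustration under assumptions: the card does not state a warmup length (0 is assumed here) or the total number of training steps (31000, the last step logged in the results table, is used as a stand-in).

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 31000 total steps (last logged step), no warmup
TOTAL_STEPS = 31000
for step in (0, 10000, 20000, 31000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 2e-05 and reaches zero at the final step.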

Training results

| Step  | Training Loss | Validation Loss | Gen Len |
|-------|---------------|-----------------|---------|
| 1000  | 15.0703       | 5.6154          | 2.3840  |
| 2000  | 6.0460        | 4.4449          | 4.6281  |
| 3000  | 5.2580        | 3.9632          | 4.7790  |
| 4000  | 4.8563        | 3.6188          | 5.0053  |
| 5000  | 4.5602        | 3.3491          | 5.3085  |
| 6000  | 4.3146        | 3.1572          | 5.2562  |
| 7000  | 4.1228        | 3.0084          | 5.2197  |
| 8000  | 3.9695        | 2.8727          | 5.2140  |
| 9000  | 3.8342        | 2.7651          | 5.1834  |
| 10000 | 3.7319        | 2.6661          | 5.1977  |
| 11000 | 3.6485        | 2.5864          | 5.1536  |
| 12000 | 3.5541        | 2.5080          | 5.1990  |
| 13000 | 3.4959        | 2.4464          | 5.1775  |
| 14000 | 3.4315        | 2.3931          | 5.1747  |
| 15000 | 3.3663        | 2.3401          | 5.1625  |
| 16000 | 3.3204        | 2.3034          | 5.1481  |
| 17000 | 3.2417        | 2.2593          | 5.1663  |
| 18000 | 3.2186        | 2.2283          | 5.1351  |
| 19000 | 3.1822        | 2.1946          | 5.1573  |
| 20000 | 3.1449        | 2.1690          | 5.1649  |
| 21000 | 3.1067        | 2.1402          | 5.1624  |
| 22000 | 3.0844        | 2.1258          | 5.1479  |
| 23000 | 3.0574        | 2.1066          | 5.1518  |
| 24000 | 3.0357        | 2.0887          | 5.1446  |
| 25000 | 3.0136        | 2.0746          | 5.1559  |
| 26000 | 2.9957        | 2.0609          | 5.1658  |
| 27000 | 2.9865        | 2.0510          | 5.1791  |
| 28000 | 2.9765        | 2.0456          | 5.1574  |
| 29000 | 2.9675        | 2.0386          | 5.1620  |
| 30000 | 2.9678        | 2.0344          | 5.1601  |
| 31000 | 2.9652        | 2.0320          | 5.1538  |
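The loss values above can be easier to interpret as perplexity. The sketch below assumes the reported losses are mean per-token cross-entropy in nats, which is the usual convention for sequence-to-sequence training but is not stated explicitly in this card. The `perplexity` helper is illustrative, not part of the original training code.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Validation loss at the first and last logged steps in the results table
for step, val_loss in ((1000, 5.6154), (31000, 2.0320)):
    print(f"step {step}: validation perplexity = {perplexity(val_loss):.2f}")
```

Lower perplexity means the model spreads less probability mass over incorrect next tokens, so the drop between the first and last rows reflects the same improvement shown by the loss column.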

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.4.0
  • Datasets 3.0.1
  • Tokenizers 0.20.0

Citation

If you find this model useful, please cite this work.

@misc {yubraj_sigdel_2024,
    author       = { {Yubraj Sigdel} },
    title        = { romaneng2nep_v3 (Revision dca017e) },
    year         = 2024,
    url          = { https://huggingface.co/syubraj/romaneng2nep_v3 },
    doi          = { 10.57967/hf/3252 },
    publisher    = { Hugging Face }
}