marianMT_hin_eng_cs
This model is a fine-tuned version of Helsinki-NLP/opus-mt-mul-en on the ar5entum/hindi-english-code-mixed dataset. It achieves the following results on the evaluation set:
- Loss: 0.1450
- Bleu: 77.8649
- Gen Len: 74.8945
Model description
The model translates Hindi text written in Devanagari script into a code-switched (CS) format: Hindi words are kept in Devanagari, while English words that appear transliterated in Devanagari are converted to their Roman-script spelling. The output is intended to reflect the language mixing present in the source sentence.
Example:
Hindi (Devanagari input) | Hindi + English CS (model output) |
---|---|
तो वो टोटली मेरे घर के प्लान पे डिपेंड करता है | to वो totally मेरे घर के plan पे depend करता है |
मांग लो भाई बहुत नेसेसरी है | मांग लो भाई बहुत necessary है |
टेलीविज़न में क्या प्रोग्राम चल रहा है? | television में क्या program चल रहा है? |
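The mixed-script output format shown in the table can be inspected with a small helper. The following is a minimal sketch, not part of the model: `token_script` and `tag_scripts` are hypothetical names, and the classification relies only on the Devanagari Unicode block (U+0900 to U+097F) and ASCII letters.

```python
def token_script(token):
    """Classify a whitespace-separated token as 'devanagari', 'latin', or 'other'."""
    # Devanagari Unicode block: U+0900 .. U+097F
    has_devanagari = any("\u0900" <= ch <= "\u097F" for ch in token)
    # ASCII letters only; punctuation such as '?' does not count as Latin
    has_latin = any(ch.isascii() and ch.isalpha() for ch in token)
    if has_devanagari and not has_latin:
        return "devanagari"
    if has_latin and not has_devanagari:
        return "latin"
    return "other"


def tag_scripts(sentence):
    """Return (token, script) pairs for every token in the sentence."""
    return [(tok, token_script(tok)) for tok in sentence.split()]


# Second output row of the example table above
example = "मांग लो भाई बहुत necessary है"
tags = tag_scripts(example)
for tok, script in tags:
    print(f"{tok}\t{script}")
```

A check like this may help when spot-checking outputs, for example to count how many tokens in a translation were converted to Roman script.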
from transformers import MarianMTModel, MarianTokenizer

class HinEngCS:
    def __init__(self, model_name='ar5entum/marianMT_hin_eng_cs'):
        # Load the fine-tuned tokenizer and model from the Hugging Face Hub
        self.model_name = model_name
        self.tokenizer = MarianTokenizer.from_pretrained(model_name)
        self.model = MarianMTModel.from_pretrained(model_name)

    def predict(self, input_text):
        # Tokenize a single sentence, generate, and decode the first sequence
        tokenized_text = self.tokenizer(input_text, return_tensors='pt')
        translated = self.model.generate(**tokenized_text)
        translated_text = self.tokenizer.decode(translated[0], skip_special_tokens=True)
        return translated_text

model = HinEngCS()
input_text = "आज मैं नानयांग टेक्नोलॉजिकल यूनिवर्सिटी में अनेक समझौते होते हुए देखूंगा जो कि उच्च शिक्षा साइंस टेक्नोलॉजी और इनोवेशन में हमारे सहयोग को और बढ़ाएंगे।"
model.predict(input_text)
# आज मैं नानयांग technological university में अनेक समझौते होते हुए देखूंगा जो कि उच्च शिक्षा science technology और innovation में हमारे सहयोग को और बढ़ाएंगे।
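The `predict` method above handles one sentence per call. For several sentences, a batched variant along the following lines may be more convenient. This is a sketch under assumptions: `translate_batch` is a hypothetical helper, and the `batch_size` and `max_new_tokens` defaults are illustrative values that do not come from the model card. It uses only the tokenizer call, `generate`, and `batch_decode` from transformers.

```python
def translate_batch(texts, tokenizer, model, batch_size=16, max_new_tokens=128):
    """Translate a list of sentences in fixed-size batches.

    `tokenizer` and `model` are expected to be a MarianTokenizer and a
    MarianMTModel (or any objects with the same call/generate/batch_decode
    interface). The default sizes are assumptions, not tuned values.
    """
    outputs = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        # Pad to the longest sentence in the chunk so it forms one tensor
        encoded = tokenizer(chunk, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**encoded, max_new_tokens=max_new_tokens)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

With the class above, this could be called as `translate_batch(sentences, model.tokenizer, model.model)`, where `sentences` is a list of Devanagari strings.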
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 50
- eval_batch_size: 50
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 100
- total_eval_batch_size: 100
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30.0
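The relationship between these values can be sketched as plain arithmetic. The snippet below is illustrative only: it takes the 1118 steps per epoch from the training results table, and it assumes zero warmup steps because the card lists none. `linear_lr` is a hypothetical helper that mirrors the usual linear-decay rule, not the exact scheduler object used in training.

```python
learning_rate = 5e-5
train_batch_size = 50      # per device
num_devices = 2
num_epochs = 30
steps_per_epoch = 1118     # step count at epoch 1.0 in the training results table

# Effective batch size across both GPUs
total_train_batch_size = train_batch_size * num_devices

# Total optimizer steps if all configured epochs run
total_steps = steps_per_epoch * num_epochs

# Rough upper bound on the number of training examples
approx_train_examples = steps_per_epoch * total_train_batch_size


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup_steps=0):
    """Linear warmup followed by linear decay to zero (warmup assumed to be 0)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup_steps))


print(total_train_batch_size, total_steps, approx_train_examples)
print(linear_lr(22360))  # learning rate at the step logged for epoch 20
```

Under these assumptions, the learning rate at step 22360 is one third of the base rate, since 22360 of the 33540 scheduled steps have elapsed.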
Training results
Training Loss | Epoch | Step | Bleu | Gen Len | Validation Loss |
---|---|---|---|---|---|
1.5823 | 1.0 | 1118 | 11.6257 | 77.1622 | 1.1778 |
0.921 | 2.0 | 2236 | 33.2917 | 76.1459 | 0.6357 |
0.6472 | 3.0 | 3354 | 47.3533 | 75.9194 | 0.4504 |
0.5246 | 4.0 | 4472 | 55.2169 | 75.6871 | 0.3579 |
0.4228 | 5.0 | 5590 | 60.8262 | 75.5777 | 0.3041 |
0.3745 | 6.0 | 6708 | 64.8987 | 75.4424 | 0.2693 |
0.3552 | 7.0 | 7826 | 67.7607 | 75.2438 | 0.2455 |
0.3324 | 8.0 | 8944 | 69.635 | 75.1036 | 0.2274 |
0.2912 | 9.0 | 10062 | 71.3086 | 75.0326 | 0.2117 |
0.2591 | 10.0 | 11180 | 72.392 | 74.9607 | 0.2001 |
0.2471 | 11.0 | 12298 | 73.4758 | 74.9251 | 0.1899 |
0.236 | 12.0 | 13416 | 74.4219 | 74.833 | 0.1822 |
0.2265 | 13.0 | 14534 | 75.1435 | 74.9069 | 0.1745 |
0.2152 | 14.0 | 15652 | 75.7614 | 74.7409 | 0.1695 |
0.2078 | 15.0 | 16770 | 76.2353 | 74.7092 | 0.1641 |
0.2048 | 16.0 | 17888 | 76.7381 | 74.7274 | 0.1593 |
0.1975 | 17.0 | 19006 | 76.9954 | 74.7217 | 0.1559 |
0.1943 | 18.0 | 20124 | 77.421 | 74.6641 | 0.1524 |
0.1987 | 19.0 | 21242 | 77.8231 | 74.6833 | 0.1495 |
0.1855 | 20.0 | 22360 | 78.0784 | 74.6804 | 0.1472 |
Framework versions
- Transformers 4.45.0.dev0
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1