# eng-mal-translator
This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the [Govardhan-06/flores_eng_mal](https://huggingface.co/datasets/Govardhan-06/flores_eng_mal) dataset. It achieves the following results on the evaluation set:
- Loss: 1.1660
- Bleu: 16.9895
## Overview

This project fine-tunes a translation model from English to Malayalam. It starts from the facebook/nllb-200-distilled-600M checkpoint in Hugging Face's Transformers library and fine-tunes it on a parallel English-Malayalam dataset (see below). The goal is accurate translation of English text inputs into Malayalam.
## Dataset Used

The training data is a curated corpus of parallel English-Malayalam text pairs, used for both training and evaluation of the translation model: [Govardhan-06/flores_eng_mal](https://huggingface.co/datasets/Govardhan-06/flores_eng_mal).
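A minimal sketch of pulling the dataset with the `datasets` library; the split and column names are not documented here, so inspect the loaded object to confirm them:

```python
from datasets import load_dataset

# Load the parallel English-Malayalam corpus referenced above.
dataset = load_dataset("Govardhan-06/flores_eng_mal")

# Inspect the available splits and columns; exact names should be
# verified against the dataset card rather than assumed.
print(dataset)
```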
## Model Used

The translation model is based on the [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) checkpoint, chosen for its efficiency and performance in sequence-to-sequence translation tasks.
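A minimal sketch of loading the base checkpoint and tokenizer for fine-tuning; the NLLB language codes `eng_Latn` and `mal_Mlym` select English (Latin script) and Malayalam (Malayalam script):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "facebook/nllb-200-distilled-600M"

# src_lang/tgt_lang configure the NLLB tokenizer for English -> Malayalam.
tokenizer = AutoTokenizer.from_pretrained(
    checkpoint, src_lang="eng_Latn", tgt_lang="mal_Mlym"
)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```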
## Functionality

Users can input English text, and the model generates the corresponding Malayalam translation, facilitating cross-language communication (see the inference sketch below).
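A minimal inference sketch using the `translation` pipeline. The model id `Govardhan-06/eng-mal-translator` is an assumption based on this card's name; replace it with the actual repository id if it differs:

```python
from transformers import pipeline

# Assumed model id; swap in the real repository id if different.
translator = pipeline(
    "translation",
    model="Govardhan-06/eng-mal-translator",
    src_lang="eng_Latn",   # source: English (Latin script)
    tgt_lang="mal_Mlym",   # target: Malayalam (Malayalam script)
)

result = translator("How are you today?", max_length=128)
print(result[0]["translation_text"])
```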
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
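A sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`. The output directory and per-epoch evaluation strategy are assumptions, and the listed Adam betas/epsilon are simply the optimizer defaults:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="eng-mal-translator",    # assumed output path
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    fp16=True,                          # Native AMP mixed precision
    eval_strategy="epoch",              # assumed: evaluate once per epoch, matching the table below
    predict_with_generate=True,         # generate translations during evaluation to score BLEU
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults.
)
```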
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu    |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| No log        | 1.0   | 226  | 1.1084          | 15.0719 |
| No log        | 2.0   | 452  | 1.0917          | 16.3698 |
| 1.1672        | 3.0   | 678  | 1.0952          | 16.2931 |
| 1.1672        | 4.0   | 904  | 1.0994          | 16.7858 |
| 0.8967        | 5.0   | 1130 | 1.1154          | 16.5906 |
| 0.8967        | 6.0   | 1356 | 1.1300          | 17.7039 |
| 0.7415        | 7.0   | 1582 | 1.1414          | 16.8886 |
| 0.7415        | 8.0   | 1808 | 1.1523          | 17.1442 |
| 0.6532        | 9.0   | 2034 | 1.1628          | 16.9454 |
| 0.6532        | 10.0  | 2260 | 1.1660          | 16.9895 |
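The BLEU scores above are computed from generated translations at each evaluation step. A sketch of a `compute_metrics` function using sacrebleu via the `evaluate` library (an assumption about the exact metric implementation used here):

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Assumption: BLEU is computed with sacrebleu, the usual choice when
# fine-tuning translation models with Seq2SeqTrainer.
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace -100 (ignored label positions) with the pad token id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=[p.strip() for p in decoded_preds],
        references=[[l.strip()] for l in decoded_labels],
    )
    return {"bleu": result["score"]}
```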
### Framework versions
- Transformers 4.42.3
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1