nllb-200-distilled-600M_dyu-fra

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 2.0050
Bleu: 10.2534
Gen Len: 12.4147
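
The card does not say how Bleu and Gen Len were computed. The sketch below assumes the convention used in the Transformers translation example: corpus-level BLEU from the sacrebleu package on a 0-100 scale, and Gen Len as the mean number of non-padding tokens per generated sequence. The function names are hypothetical and are not taken from the training code.

```python
def mean_generation_length(predictions, pad_token_id):
    """Average number of non-padding tokens per generated sequence ("Gen Len")."""
    lengths = [sum(1 for tok in seq if tok != pad_token_id) for seq in predictions]
    return sum(lengths) / len(lengths)


def corpus_bleu_score(hypotheses, references):
    """Corpus-level BLEU on a 0-100 scale (assumption: sacrebleu defaults)."""
    import sacrebleu  # imported lazily; install with `pip install sacrebleu`

    # sacrebleu expects a list of reference streams, each parallel to the hypotheses.
    return sacrebleu.corpus_bleu(hypotheses, [references]).score


# Two generated sequences padded with token id 1: lengths 2 and 3.
example_gen_len = mean_generation_length([[5, 6, 1, 1], [7, 8, 9, 1]], pad_token_id=1)
```

Under this reading, a Gen Len of about 12.4 means the model emits roughly 12 tokens per translation on the evaluation set.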

Model description

More information needed

Intended uses & limitations

More information needed
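
Until this section is filled in, the following is a minimal inference sketch. It assumes the checkpoint keeps the NLLB language tags of the base model, with `dyu_Latn` for Dyula as the source and `fra_Latn` for French as the target, which the `dyu-fra` suffix suggests but the card does not confirm. The repository id is the one shown on this page; the `translate` helper is hypothetical.

```python
MODEL_ID = "williamhtan/nllb-200-distilled-600M_dyu-fra"
SRC_LANG = "dyu_Latn"  # assumption: NLLB tag for Dyula
TGT_LANG = "fra_Latn"  # assumption: NLLB tag for French


def translate(texts, max_new_tokens=64):
    """Translate a list of source sentences with the fine-tuned checkpoint."""
    # Imported lazily so the constants above can be inspected without the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(
        **inputs,
        # Force the decoder to start with the target-language tag.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `translate` with a list of Dyula sentences should return a list of French strings of the same length. The first call downloads the weights, and with a reported BLEU of about 10 the output is best treated as a rough draft.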

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
mixed_precision_training: Native AMP
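
As a rough sketch, the values above could map onto `Seq2SeqTrainingArguments` as follows. The mapping is an assumption: the batch sizes are read as per-device values on a single device, "Native AMP" is read as `fp16=True`, and `eval_strategy` and `predict_with_generate` are not listed in the card but are inferred from the per-epoch BLEU rows in the results table. The helper names are hypothetical. `linear_lr` mirrors the linear schedule with zero warmup steps, since the card lists none.

```python
# Hyperparameters copied from the list above, keyed by the
# Seq2SeqTrainingArguments fields they most plausibly correspond to.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 64,  # assumption: single device
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
    "fp16": True,  # "Native AMP"; needs a CUDA device
}


def build_training_args(output_dir):
    """Build Seq2SeqTrainingArguments from the values above."""
    from transformers import Seq2SeqTrainingArguments  # imported lazily

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        eval_strategy="epoch",       # assumption; older releases call this evaluation_strategy
        predict_with_generate=True,  # assumption: required to score generated text
        **TRAINING_KWARGS,
    )


def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With the 2540 total steps shown in the training results, `linear_lr(1270, 2540)` gives about 1e-05, so the learning rate has halved by the end of epoch 10.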

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
No log	1.0	127	2.2376	7.0259	12.5207
No log	2.0	254	2.1404	8.1151	12.2665
No log	3.0	381	2.0926	8.5971	12.3487
2.3978	4.0	508	2.0610	9.0697	12.4276
2.3978	5.0	635	2.0368	9.1953	12.5058
2.3978	6.0	762	2.0260	9.6559	12.5785
2.3978	7.0	889	2.0141	9.9721	12.3861
1.9131	8.0	1016	2.0068	9.9515	12.5071
1.9131	9.0	1143	2.0036	10.1965	12.3855
1.9131	10.0	1270	2.0004	10.1496	12.5085
1.9131	11.0	1397	1.9989	10.203	12.6275
1.7208	12.0	1524	1.9997	10.3765	12.5459
1.7208	13.0	1651	1.9979	10.5581	12.4963
1.7208	14.0	1778	1.9963	10.4845	12.3494
1.7208	15.0	1905	2.0023	10.4471	12.4154
1.6069	16.0	2032	2.0019	10.3647	12.395
1.6069	17.0	2159	2.0022	10.3361	12.3474
1.6069	18.0	2286	2.0040	10.2938	12.3576
1.6069	19.0	2413	2.0053	10.2612	12.3868
1.5427	20.0	2540	2.0050	10.2534	12.4147
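
The headline metrics at the top of the card are those of the final epoch, which is not the best row in the table. The short script below re-reads the table to find the best epochs and to estimate the training set size. The size estimate assumes a single device, no gradient accumulation, and a partial last batch that is kept; none of these are stated in the card.

```python
# Rows copied from the table above: (epoch, step, validation loss, BLEU).
ROWS = [
    (1, 127, 2.2376, 7.0259), (2, 254, 2.1404, 8.1151),
    (3, 381, 2.0926, 8.5971), (4, 508, 2.0610, 9.0697),
    (5, 635, 2.0368, 9.1953), (6, 762, 2.0260, 9.6559),
    (7, 889, 2.0141, 9.9721), (8, 1016, 2.0068, 9.9515),
    (9, 1143, 2.0036, 10.1965), (10, 1270, 2.0004, 10.1496),
    (11, 1397, 1.9989, 10.203), (12, 1524, 1.9997, 10.3765),
    (13, 1651, 1.9979, 10.5581), (14, 1778, 1.9963, 10.4845),
    (15, 1905, 2.0023, 10.4471), (16, 2032, 2.0019, 10.3647),
    (17, 2159, 2.0022, 10.3361), (18, 2286, 2.0040, 10.2938),
    (19, 2413, 2.0053, 10.2612), (20, 2540, 2.0050, 10.2534),
]
BATCH_SIZE = 64  # train_batch_size from the hyperparameter list

best_bleu = max(ROWS, key=lambda row: row[3])  # highest BLEU
best_loss = min(ROWS, key=lambda row: row[2])  # lowest validation loss
final = ROWS[-1]

# 127 optimizer steps per epoch bounds the number of training examples.
steps_per_epoch = ROWS[0][1]
min_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1
max_examples = steps_per_epoch * BATCH_SIZE

print(f"best BLEU: {best_bleu[3]} at epoch {best_bleu[0]}")
print(f"lowest validation loss: {best_loss[2]} at epoch {best_loss[0]}")
print(f"final epoch {final[0]}: BLEU {final[3]}, loss {final[2]}")
print(f"training examples: between {min_examples} and {max_examples}")
```

BLEU peaks at epoch 13 (10.5581) and validation loss bottoms out at epoch 14 (1.9963), while training loss keeps falling to 1.5427. That pattern suggests mild overfitting in the later epochs, so a checkpoint from around epoch 13 or 14 may be preferable if one was saved.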

Framework versions

Transformers 4.42.4
PyTorch 2.3.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Repository: williamhtan/nllb-200-distilled-600M_dyu-fra