nllb-200-distilled-600M_dyu-fra

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. On the evaluation set it achieves the following results, which match the final (epoch 20) row of the training results:

  • Loss: 2.0050
  • Bleu: 10.2534
  • Gen Len: 12.4147
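
The card does not say how to run the model, so the following is only a minimal usage sketch. It assumes the repository name williamhtan/nllb-200-distilled-600M_dyu-fra means Dyula-to-French translation, and that the standard NLLB-200 language codes `dyu_Latn` and `fra_Latn` apply; neither is confirmed by the card. The helper names `load` and `translate` are hypothetical.

```python
# Assumptions (not stated in the card): Dyula -> French direction,
# NLLB-200 language codes dyu_Latn / fra_Latn.
MODEL_ID = "williamhtan/nllb-200-distilled-600M_dyu-fra"
SRC_LANG = "dyu_Latn"
TGT_LANG = "fra_Latn"


def load(model_id=MODEL_ID, src_lang=SRC_LANG):
    """Load the tokenizer and model (transformers is imported lazily)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def translate(texts, tokenizer, model, tgt_lang=TGT_LANG, max_new_tokens=64):
    """Translate a list of source sentences into the target language."""
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    output_ids = model.generate(
        **inputs,
        # NLLB selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

A call would look like `tokenizer, model = load()` followed by `translate(["<Dyula sentence>"], tokenizer, model)`. The `max_new_tokens=64` default is an arbitrary choice for this sketch; the reported average generation length is about 12 tokens.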

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
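
To make these settings more concrete, here is a small sketch of what they imply when combined with the 127 steps per epoch reported in the training results. It assumes zero warmup steps (none are listed), a single device, no gradient accumulation, and that the last partial batch is kept; the card confirms none of these. The function names are hypothetical.

```python
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 64
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 127  # step count at epoch 1.0 in the training results
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under a linear schedule: optional warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Smallest and largest training-set sizes that yield this many batches per epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


print(TOTAL_STEPS)              # total optimizer steps over 20 epochs
print(linear_lr(1270))          # learning rate halfway through training
print(train_set_size_bounds())  # implied range of training examples
```

Under those assumptions the run takes 2540 optimizer steps, the learning rate has dropped to about 1e-05 by step 1270, and the training set holds between 8065 and 8128 examples.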

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|---------------|-------|------|-----------------|---------|---------|
| No log        | 1.0   | 127  | 2.2376          | 7.0259  | 12.5207 |
| No log        | 2.0   | 254  | 2.1404          | 8.1151  | 12.2665 |
| No log        | 3.0   | 381  | 2.0926          | 8.5971  | 12.3487 |
| 2.3978        | 4.0   | 508  | 2.0610          | 9.0697  | 12.4276 |
| 2.3978        | 5.0   | 635  | 2.0368          | 9.1953  | 12.5058 |
| 2.3978        | 6.0   | 762  | 2.0260          | 9.6559  | 12.5785 |
| 2.3978        | 7.0   | 889  | 2.0141          | 9.9721  | 12.3861 |
| 1.9131        | 8.0   | 1016 | 2.0068          | 9.9515  | 12.5071 |
| 1.9131        | 9.0   | 1143 | 2.0036          | 10.1965 | 12.3855 |
| 1.9131        | 10.0  | 1270 | 2.0004          | 10.1496 | 12.5085 |
| 1.9131        | 11.0  | 1397 | 1.9989          | 10.203  | 12.6275 |
| 1.7208        | 12.0  | 1524 | 1.9997          | 10.3765 | 12.5459 |
| 1.7208        | 13.0  | 1651 | 1.9979          | 10.5581 | 12.4963 |
| 1.7208        | 14.0  | 1778 | 1.9963          | 10.4845 | 12.3494 |
| 1.7208        | 15.0  | 1905 | 2.0023          | 10.4471 | 12.4154 |
| 1.6069        | 16.0  | 2032 | 2.0019          | 10.3647 | 12.395  |
| 1.6069        | 17.0  | 2159 | 2.0022          | 10.3361 | 12.3474 |
| 1.6069        | 18.0  | 2286 | 2.0040          | 10.2938 | 12.3576 |
| 1.6069        | 19.0  | 2413 | 2.0053          | 10.2612 | 12.3868 |
| 1.5427        | 20.0  | 2540 | 2.0050          | 10.2534 | 12.4147 |
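
The headline metrics come from the last epoch, but the table shows slightly better values earlier in training. The sketch below copies the epoch, validation loss, and BLEU columns from the table and picks the best epoch by each criterion. Whether a checkpoint for those epochs was saved or published is not stated in the card, so this is only a way to read the table, not a description of how the released weights were chosen.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
ROWS = [
    (1, 2.2376, 7.0259), (2, 2.1404, 8.1151), (3, 2.0926, 8.5971),
    (4, 2.0610, 9.0697), (5, 2.0368, 9.1953), (6, 2.0260, 9.6559),
    (7, 2.0141, 9.9721), (8, 2.0068, 9.9515), (9, 2.0036, 10.1965),
    (10, 2.0004, 10.1496), (11, 1.9989, 10.203), (12, 1.9997, 10.3765),
    (13, 1.9979, 10.5581), (14, 1.9963, 10.4845), (15, 2.0023, 10.4471),
    (16, 2.0019, 10.3647), (17, 2.0022, 10.3361), (18, 2.0040, 10.2938),
    (19, 2.0053, 10.2612), (20, 2.0050, 10.2534),
]

best_bleu = max(ROWS, key=lambda row: row[2])  # highest BLEU
best_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss
final = ROWS[-1]                               # the reported (epoch 20) result

print("best BLEU:", best_bleu)
print("lowest validation loss:", best_loss)
print("BLEU gap, best vs final:", round(best_bleu[2] - final[2], 4))
```

BLEU peaks at epoch 13 (10.5581) and validation loss bottoms out at epoch 14 (1.9963), after which both drift slightly while training loss keeps falling, a pattern consistent with mild overfitting in the later epochs.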

Framework versions

  • Transformers 4.42.4
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 615M params, stored as Safetensors with tensor type F32.

Model tree for williamhtan/nllb-200-distilled-600M_dyu-fra: fine-tuned from facebook/nllb-200-distilled-600M.