
ko-en_mbartLarge_exp10p

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 1.1283
  • Bleu: 28.8237
  • Gen Len: 18.5382
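
This card does not ship a usage snippet, so here is a minimal inference sketch based on the standard mBART-50 Transformers API. The repository id is assumed from this card's title, and the sample sentence is illustrative, not taken from the training data:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Repository id assumed from this card's title; adjust if the repo path differs.
model_id = "yesj1234/ko-en_mbartLarge_exp10p"

tokenizer = MBart50TokenizerFast.from_pretrained(
    model_id, src_lang="ko_KR", tgt_lang="en_XX"
)
model = MBartForConditionalGeneration.from_pretrained(model_id)

text = "안녕하세요, 만나서 반갑습니다."  # "Hello, nice to meet you." (illustrative)
inputs = tokenizer(text, return_tensors="pt")

# mBART-50 many-to-many checkpoints require forcing the decoder to start
# with the target-language token, here English.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```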

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 40
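
The card lists the hyperparameters but not the script that consumed them. A hedged reconstruction as Seq2SeqTrainingArguments might look like the sketch below; the output directory is a placeholder, and the Adam betas/epsilon listed above match the Trainer defaults, so they need no explicit arguments. The multi-GPU setup comes from the launcher (e.g. torchrun with 4 processes), not from these arguments:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="./ko-en_mbartLarge_exp10p",
    learning_rate=5e-5,
    per_device_train_batch_size=4,  # x 4 GPUs x 2 accumulation steps = 32 total
    per_device_eval_batch_size=4,   # x 4 GPUs = 16 total
    gradient_accumulation_steps=2,
    seed=42,
    lr_scheduler_type="cosine_with_restarts",
    warmup_steps=1000,
    num_train_epochs=40,
    predict_with_generate=True,  # needed for Bleu / Gen Len at evaluation time
)
```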

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 1.4782        | 0.31  | 2000  | 1.4360          | 21.538  | 18.6032 |
| 1.3618        | 0.62  | 4000  | 1.3226          | 23.8354 | 18.5594 |
| 1.2983        | 0.93  | 6000  | 1.2637          | 25.0795 | 18.7894 |
| 1.2065        | 1.24  | 8000  | 1.2371          | 25.7409 | 18.5615 |
| 1.1926        | 1.55  | 10000 | 1.2116          | 26.0527 | 18.4019 |
| 1.1734        | 1.86  | 12000 | 1.1907          | 26.9802 | 18.6141 |
| 1.0677        | 2.17  | 14000 | 1.1802          | 27.1925 | 18.4547 |
| 1.0773        | 2.48  | 16000 | 1.1655          | 27.5641 | 18.6726 |
| 1.0688        | 2.78  | 18000 | 1.1521          | 27.6261 | 18.6127 |
| 0.9542        | 3.09  | 20000 | 1.1709          | 27.16   | 18.3782 |
| 0.9531        | 3.4   | 22000 | 1.1435          | 28.0684 | 18.436  |
| 0.9756        | 3.71  | 24000 | 1.1565          | 27.6025 | 18.7284 |
| 0.9964        | 4.02  | 26000 | 1.2285          | 25.6999 | 18.3255 |
| 0.9721        | 4.33  | 28000 | 1.1881          | 27.3499 | 18.5409 |
| 0.9237        | 4.64  | 30000 | 1.1497          | 28.2692 | 18.6614 |
| 0.9041        | 4.95  | 32000 | 1.1283          | 28.8215 | 18.5493 |
| 0.6842        | 5.26  | 34000 | 1.1741          | 28.6873 | 18.515  |
| 0.7101        | 5.57  | 36000 | 1.1876          | 28.0778 | 18.3422 |
| 0.7697        | 5.88  | 38000 | 1.1898          | 27.6338 | 18.6766 |
| 0.6028        | 6.19  | 40000 | 1.2393          | 28.0713 | 18.5903 |
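
The Bleu and Gen Len columns above come from the evaluation loop. The actual training script is not part of this card, but a metric hook in the style commonly paired with Seq2SeqTrainer would look roughly like this sketch (the function name and tokenizer setup are assumptions):

```python
import numpy as np
import evaluate
from transformers import MBart50TokenizerFast

# Assumed setup; the card does not include the original evaluation code.
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt", src_lang="ko_KR", tgt_lang="en_XX"
)
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # The Trainer pads label positions with -100; restore the pad token
    # before decoding so batch_decode does not fail.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    bleu = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[ref] for ref in decoded_labels],
    )
    gen_len = np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    )
    return {"bleu": bleu["score"], "gen_len": gen_len}
```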

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.5
  • Tokenizers 0.14.1