ko-en_mbartLarge_exp20p_linear_alpha

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt for Korean-to-English translation, trained on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 1.1682
  • Bleu: 29.1144
  • Gen Len: 18.5459
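A minimal Korean-to-English inference sketch, following the standard mBART-50 many-to-many usage pattern. The Hub repo id is inferred from the card title and is an assumption, not stated in the card:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Repo id inferred from the card title (assumption).
repo_id = "yesj1234/ko-en_mbartLarge_exp20p_linear_alpha"
model = MBartForConditionalGeneration.from_pretrained(repo_id)
tokenizer = MBart50TokenizerFast.from_pretrained(repo_id)

tokenizer.src_lang = "ko_KR"  # source language: Korean
inputs = tokenizer("안녕하세요, 만나서 반갑습니다.", return_tensors="pt")  # "Hello, nice to meet you."
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],  # force English output
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```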

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 5.5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
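A sketch of how these settings map onto Seq2SeqTrainingArguments. The evaluation cadence is inferred from the 4000-step intervals in the results table below, and the output path and launch command are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Effective train batch size: 4 per device x 4 GPUs x 2 accumulation steps = 32.
training_args = Seq2SeqTrainingArguments(
    output_dir="ko-en_mbartLarge_exp20p_linear_alpha",  # hypothetical output path
    learning_rate=5.5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",  # assumption: matches the 4000-step eval cadence
    eval_steps=4000,
    predict_with_generate=True,   # assumption: needed to report BLEU/Gen Len
)
# Multi-GPU launch across 4 devices, e.g. `torchrun --nproc_per_node=4 ...` (assumption).
```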

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 1.404         | 0.46  | 4000  | 1.3738          | 22.5375 | 18.6852 |
| 1.2629        | 0.93  | 8000  | 1.2458          | 25.3741 | 18.7797 |
| 1.1951        | 1.39  | 12000 | 1.2067          | 26.1281 | 18.6597 |
| 1.1317        | 1.86  | 16000 | 1.1768          | 26.5384 | 19.2055 |
| 0.9906        | 2.32  | 20000 | 1.1363          | 28.2459 | 18.7269 |
| 0.9894        | 2.78  | 24000 | 1.1239          | 28.5124 | 18.6882 |
| 0.8965        | 3.25  | 28000 | 1.1278          | 28.5335 | 18.4917 |
| 0.9138        | 3.71  | 32000 | 1.1216          | 28.8189 | 18.7873 |
| 0.8272        | 4.18  | 36000 | 1.1468          | 28.332  | 18.6516 |
| 0.8753        | 4.64  | 40000 | 1.1345          | 28.2695 | 18.4919 |
| 0.6855        | 5.11  | 44000 | 1.1542          | 28.7913 | 18.7596 |
| 0.7088        | 5.57  | 48000 | 1.1531          | 29.0865 | 18.6626 |
| 0.6738        | 6.03  | 52000 | 1.1906          | 28.0235 | 18.4243 |
| 0.6763        | 6.5   | 56000 | 1.1941          | 28.1501 | 18.6932 |
| 0.6594        | 6.96  | 60000 | 1.1682          | 29.1144 | 18.5459 |
| 0.5971        | 7.43  | 64000 | 1.2449          | 27.9464 | 18.4482 |
| 0.5935        | 7.89  | 68000 | 1.2156          | 28.6034 | 18.5967 |
| 0.5383        | 8.35  | 72000 | 1.2927          | 27.891  | 18.6539 |
| 0.6022        | 8.82  | 76000 | 1.2831          | 27.7624 | 18.5558 |
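The reported evaluation metrics (loss 1.1682, Bleu 29.1144, Gen Len 18.5459) correspond to the step 60000 checkpoint, which achieved the best BLEU; validation loss rises afterwards, and training stops around epoch 8.8 rather than the configured 40 epochs, presumably via early stopping.

The Bleu and Gen Len columns follow the compute_metrics convention from the Transformers translation example scripts. A minimal sketch, assuming sacreBLEU via the evaluate library (not confirmed by the card); bind the tokenizer (e.g. with functools.partial) before passing this to Seq2SeqTrainer:

```python
import numpy as np
import evaluate

metric = evaluate.load("sacrebleu")  # assumption: the card's "Bleu" is sacreBLEU

def compute_metrics(eval_preds, tokenizer):
    """Decode predictions and labels, then score them (mirrors run_translation.py)."""
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Labels use -100 for padding; restore pad_token_id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(predictions=decoded_preds,
                            references=[[label] for label in decoded_labels])
    # Gen Len is the mean number of non-pad tokens in the generated outputs.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```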

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.5
  • Tokenizers 0.14.1