summarizer-tamil-mbart

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt; the training dataset is not specified. At the last logged evaluation (step 3200, epoch 4.74) it achieves the following results on the evaluation set:

  • Loss: 2.4252
  • Rouge1: 9.3056
  • Rouge2: 2.0
  • Rougel: 9.2889
  • Rougelsum: 9.2222
  • Gen Len: 39.2233
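
For readers unfamiliar with the ROUGE figures above, the following is a minimal sketch of how a ROUGE-N F1 score is formed from n-gram overlap between a reference summary and a generated one. It assumes plain whitespace tokenization and returns a value in [0, 1]; the scorer, tokenizer, and scaling actually used for this card are not stated (the reported values appear to be on a 0-100 scale), so the function `rouge_n` is an illustration, not the evaluation code behind these numbers.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """ROUGE-N F1 in [0, 1], assuming whitespace tokenization (an assumption)."""
    ref = ngrams(reference.split(), n)
    cand = ngrams(candidate.split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n(reference, candidate, 1), 4))  # unigram overlap
print(round(rouge_n(reference, candidate, 2), 4))  # bigram overlap
```

Rouge1 and Rouge2 correspond to n=1 and n=2; Rougel and Rougelsum are based on the longest common subsequence instead and are not covered by this sketch.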

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 5
  • label_smoothing_factor: 0.1
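
As a rough guide to reproducing this setup, the sketch below maps the listed hyperparameters onto the argument names of `Seq2SeqTrainingArguments` in Transformers and shows what a linear schedule with 1000 warmup steps does to the learning rate. The `output_dir` value, the helper names `build_training_args` and `linear_lr`, and the total of 3375 steps (5 epochs at the roughly 675 steps per epoch implied by step 200 falling at epoch 0.2963 in the results table) are assumptions for illustration, not values stated on this card.

```python
# Hyperparameters copied from the list above, keyed by the argument names
# that transformers.Seq2SeqTrainingArguments uses for them.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 4,  # card: train_batch_size
    "per_device_eval_batch_size": 4,   # card: eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 1000,              # card: lr_scheduler_warmup_steps
    "num_train_epochs": 5,
    "label_smoothing_factor": 0.1,
}


def build_training_args(output_dir="summarizer-tamil-mbart"):
    """Build the training arguments; output_dir is a hypothetical local path."""
    from transformers import Seq2SeqTrainingArguments  # imported only when called
    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step, total_steps, peak_lr=5e-5, warmup_steps=1000):
    """Learning rate at `step`: linear warmup to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


ASSUMED_TOTAL_STEPS = 3375  # assumption: 5 epochs x ~675 steps per epoch
for step in (0, 500, 1000, 2000, 3375):
    print(step, linear_lr(step, ASSUMED_TOTAL_STEPS))
```

Under these assumptions the learning rate peaks at 5e-05 at step 1000, so the warmup covers roughly the first epoch and a half of training.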

Training results

Training Loss  Epoch   Step  Validation Loss  Rouge1  Rouge2  Rougel  Rougelsum  Gen Len
4.5487         0.2963  200   4.1027           2.1667  0.3333  2.1111  2.1667     39.11
4.0997         0.5926  400   3.9744           2.2222  0.3333  2.2593  2.2222     41.7633
4.0165         0.8889  600   3.9417           2.6667  0.6667  2.6667  2.7778     52.0967
3.7801         1.1852  800   3.9424           2.1444  0.0     2.2222  2.1333     45.3967
3.7308         1.4815  1000  3.9573           2.7333  0.2222  2.6063  2.7905     41.3833
3.7946         1.7778  1200  3.8979           1.0571  0.2222  0.9619  1.0571     35.37
3.6338         2.0741  1400  3.9569           1.6611  0.3333  1.6333  1.6833     30.9567
3.2282         2.3704  1600  3.0726           4.0698  0.3889  3.9754  3.9825     42.4933
3.1351         2.6667  1800  3.0771           2.8333  0.0     2.8095  2.8333     38.48
3.1739         2.9630  2000  3.0871           2.4921  0.0     2.496   2.4762     40.04
2.8247         3.2593  2200  3.0882           3.4706  0.2222  3.4421  3.4357     39.95
2.7748         3.5556  2400  3.0735           3.0     0.0     3.0     3.0        38.29
2.5244         3.8519  2600  2.4450           7.3889  1.2222  7.4667  7.5        38.1767
2.5382         4.1481  2800  2.4365           8.1111  1.9744  8.2111  8.1667     39.3333
2.4642         4.4444  3000  2.4334           8.3889  2.1905  8.5389  8.4444     37.7767
2.4641         4.7407  3200  2.4252           9.3056  2.0     9.2889  9.2222     39.2233
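
The Epoch and Step columns also give a rough idea of the training set size, which the card does not state. The sketch below works through that arithmetic. It assumes a single device and no gradient accumulation, so that each optimizer step consumes one batch of 4 examples; neither setting is confirmed by the card, and the helper `steps_per_epoch` is introduced here only for illustration.

```python
def steps_per_epoch(logged):
    """Estimate optimizer steps per epoch from (epoch, step) pairs in the table."""
    ratios = [step / epoch for epoch, step in logged]
    return round(sum(ratios) / len(ratios))


# (epoch, step) pairs copied from the first and last rows of the table.
LOGGED = [(0.2963, 200), (4.7407, 3200)]
TRAIN_BATCH_SIZE = 4  # from the hyperparameter list
NUM_EPOCHS = 5        # from the hyperparameter list

per_epoch = steps_per_epoch(LOGGED)
# Assumes one device and no gradient accumulation (not stated on the card).
approx_train_examples = per_epoch * TRAIN_BATCH_SIZE
total_steps = per_epoch * NUM_EPOCHS

print(per_epoch, approx_train_examples, total_steps)  # 675 2700 3375
```

Because the last batch of an epoch may be partial, 2700 is an upper estimate: under the same assumptions the training set holds between 2697 and 2700 examples. The estimated total of 3375 steps is also consistent with the table ending at step 3200, the last multiple of 200 before training would finish.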

Framework versions

  • Transformers 4.41.0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size: 611M params (Safetensors, tensor type F32)