
ntu_adl_summarization_mt5_s

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.6583
  • Rouge-1: 21.9729
  • Rouge-2: 7.6735
  • Rouge-l: 19.7497
  • Ave Gen Len: 17.3098 (average length of generated summaries, in tokens)
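For quick experimentation, the model can be loaded with the standard Transformers summarization pipeline. This is a minimal sketch: the model ID comes from this repository, while the input text, generation settings, and any task prefix are illustrative assumptions rather than documented usage.

```python
from transformers import pipeline

# Load the fine-tuned mT5-small checkpoint from this repository.
summarizer = pipeline("summarization", model="xjlulu/ntu_adl_summarization_mt5_s")

# Placeholder input; replace with a real article. Depending on how the model
# was fine-tuned, a task prefix such as "summarize: " may or may not be needed.
article = "Your article text goes here."

# max_length is an assumption, sized generously above the ~17-token average
# generation length reported on the evaluation set.
result = summarizer(article, max_length=64, num_beams=4)
print(result[0]["summary_text"])
```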

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto Transformers training arguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
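
For reference, the list above maps onto Hugging Face Seq2SeqTrainingArguments roughly as follows. This is a reconstruction for illustration, not the original training script; output_dir and predict_with_generate are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ntu_adl_summarization_mt5_s",  # assumption: placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,   # 4 steps x batch size 4 = effective batch 16
    lr_scheduler_type="linear",
    num_train_epochs=10,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999) from the list above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,      # assumption: needed to compute ROUGE at eval time
)
```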

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge-1 | Rouge-2 | Rouge-l | Ave Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:-----------:|
| 5.4447        | 1.0   | 1357  | 4.1235          | 17.7916 | 5.9785  | 16.5599 | 12.7161     |
| 4.7463        | 2.0   | 2714  | 3.9569          | 19.6608 | 6.7631  | 18.0768 | 14.8245     |
| 4.5203        | 3.0   | 4071  | 3.8545          | 20.5626 | 7.0737  | 18.7628 | 16.3307     |
| 4.4285        | 4.0   | 5428  | 3.7825          | 21.0690 | 7.2030  | 19.0863 | 16.7841     |
| 4.3196        | 5.0   | 6785  | 3.7269          | 21.2881 | 7.3307  | 19.2588 | 16.9276     |
| 4.2662        | 6.0   | 8142  | 3.7027          | 21.5793 | 7.5122  | 19.4806 | 17.0333     |
| 4.2057        | 7.0   | 9499  | 3.6764          | 21.7949 | 7.5987  | 19.6082 | 17.1811     |
| 4.1646        | 8.0   | 10856 | 3.6671          | 21.8164 | 7.5705  | 19.6207 | 17.2550     |
| 4.1399        | 9.0   | 12213 | 3.6602          | 21.9381 | 7.6577  | 19.7089 | 17.3014     |
| 4.1479        | 10.0  | 13570 | 3.6583          | 21.9729 | 7.6735  | 19.7497 | 17.3098     |
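
The ROUGE and generation-length numbers above are in the style produced by the Transformers summarization example scripts. As a hedged sketch of how such scores are typically computed, here is the evaluate library applied to toy predictions and references (not the actual evaluation data):

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy examples only; the real evaluation set is not documented in this card.
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

scores = rouge.compute(predictions=predictions, references=references)
# compute() returns fractions in [0, 1]; the table above reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```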

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu118
  • Datasets 2.14.5
  • Tokenizers 0.14.1