license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: t5-base-snl
    results: []
inference:
  parameters:
    max_length: 160

t5-base-snl

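This model is a fine-tuned version of north/t5_base_NCC_lm. The card does not name the fine-tuning dataset. It achieves the following results on the evaluation set (the values match the final logged epoch, epoch 19, in the training results):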

  • Loss: 2.0574
  • Rouge1: 29.7694
  • Rouge2: 15.6776
  • RougeL: 27.3556
  • RougeLsum: 28.4819
  • Gen Len: 19.0
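
As a starting point for trying the model, here is a minimal sketch of loading it with the Transformers library and generating with the `max_length: 160` setting from the card metadata. The repository id `navjordj/t5-base-snl` is an assumption that the card body does not state, and the card does not document the expected input format or task, so treat the helper `generate` (a hypothetical name) as illustrative only.

```python
# Assumption: this repository id is not stated in the card body.
MODEL_ID = "navjordj/t5-base-snl"

# Mirrors the `inference.parameters` block in the card metadata.
GENERATION_KWARGS = {"max_length": 160}


def generate(text: str) -> str:
    """Run one input through the model and return the decoded output."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors, truncating inputs that exceed the model limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` downloads the checkpoint on first use. The evaluation Gen Len of about 19 tokens was measured under the training run's own generation settings, which the card does not list, so outputs produced with `max_length=160` may be longer.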

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
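
The totals in this list follow from the per-device values, and the linear scheduler can be sketched in a few lines. The snippet below is a rough illustration, not the training script: `steps_per_epoch = 170` is read off the Step column of the training results, and `warmup_steps = 0` is an assumption because the card lists no warmup setting.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 4             # per device
eval_batch_size = 4              # per device
num_devices = 4
gradient_accumulation_steps = 4
learning_rate = 5e-05
num_epochs = 20
steps_per_epoch = 170            # from the Step column of the training results

# Effective batch sizes: per-device size times devices (times accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def linear_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Linear warmup (assumed to be 0 steps here), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 3400 optimizer steps if all 20 epochs run; the results table logs 19 epochs.
total_steps = steps_per_epoch * num_epochs

print(total_train_batch_size, total_eval_batch_size)
print(linear_lr(0, total_steps), linear_lr(total_steps // 2, total_steps))
```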

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.9943 | 1.0 | 170 | 2.2042 | 28.1135 | 13.7477 | 25.4842 | 26.6467 | 18.9768 |
| 2.7955 | 2.0 | 340 | 2.1561 | 28.5159 | 14.3492 | 26.0596 | 27.2431 | 18.9853 |
| 2.6378 | 3.0 | 510 | 2.1310 | 28.9554 | 14.6901 | 26.4208 | 27.5523 | 18.9915 |
| 2.5962 | 4.0 | 680 | 2.1110 | 29.381 | 15.1503 | 26.8406 | 27.9653 | 18.9915 |
| 2.5369 | 5.0 | 850 | 2.1020 | 29.5767 | 15.2692 | 27.0113 | 28.1849 | 18.9963 |
| 2.5103 | 6.0 | 1020 | 2.0907 | 29.6354 | 15.434 | 27.0893 | 28.2703 | 18.9963 |
| 2.4524 | 7.0 | 1190 | 2.0840 | 29.7812 | 15.4963 | 27.2779 | 28.385 | 18.9963 |
| 2.4472 | 8.0 | 1360 | 2.0800 | 29.6011 | 15.5138 | 27.1381 | 28.2799 | 18.9963 |
| 2.4089 | 9.0 | 1530 | 2.0752 | 29.7647 | 15.6183 | 27.318 | 28.4747 | 18.9963 |
| 2.4011 | 10.0 | 1700 | 2.0710 | 29.6533 | 15.5536 | 27.2687 | 28.4457 | 19.0 |
| 2.3792 | 11.0 | 1870 | 2.0656 | 29.8668 | 15.6931 | 27.4208 | 28.5477 | 19.0 |
| 2.3588 | 12.0 | 2040 | 2.0635 | 29.8378 | 15.682 | 27.4635 | 28.5803 | 18.9963 |
| 2.3397 | 13.0 | 2210 | 2.0630 | 29.9043 | 15.7535 | 27.5065 | 28.6539 | 19.0 |
| 2.3201 | 14.0 | 2380 | 2.0600 | 29.7926 | 15.7077 | 27.4066 | 28.5302 | 18.9963 |
| 2.3241 | 15.0 | 2550 | 2.0615 | 29.8536 | 15.7929 | 27.4572 | 28.5704 | 19.0 |
| 2.3183 | 16.0 | 2720 | 2.0574 | 29.7529 | 15.6729 | 27.3388 | 28.4678 | 19.0 |
| 2.3346 | 17.0 | 2890 | 2.0571 | 29.7443 | 15.6459 | 27.3245 | 28.4549 | 19.0 |
| 2.2932 | 18.0 | 3060 | 2.0577 | 29.7467 | 15.6717 | 27.3391 | 28.4541 | 19.0 |
| 2.2755 | 19.0 | 3230 | 2.0574 | 29.7694 | 15.6776 | 27.3556 | 28.4819 | 19.0 |
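
The headline metrics correspond to the last logged epoch, which is not the best row on every metric. The short sketch below, using only the epoch, validation loss, and Rouge1 columns copied from the results above, finds the best epoch under each criterion. The card does not say whether the published weights are the last or the best checkpoint, so this is purely a reading aid.

```python
# (epoch, validation_loss, rouge1) copied from the training results.
RESULTS = [
    (1, 2.2042, 28.1135), (2, 2.1561, 28.5159), (3, 2.1310, 28.9554),
    (4, 2.1110, 29.381), (5, 2.1020, 29.5767), (6, 2.0907, 29.6354),
    (7, 2.0840, 29.7812), (8, 2.0800, 29.6011), (9, 2.0752, 29.7647),
    (10, 2.0710, 29.6533), (11, 2.0656, 29.8668), (12, 2.0635, 29.8378),
    (13, 2.0630, 29.9043), (14, 2.0600, 29.7926), (15, 2.0615, 29.8536),
    (16, 2.0574, 29.7529), (17, 2.0571, 29.7443), (18, 2.0577, 29.7467),
    (19, 2.0574, 29.7694),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])
last_epoch = RESULTS[-1]

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
print("last logged epoch:", last_epoch)
```

Validation loss is essentially flat from epoch 16 onward, and Rouge1 moves by less than 0.2 points after epoch 11, which suggests the run had largely converged well before the final epoch.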

Framework versions

  • Transformers 4.27.0.dev0
  • PyTorch 1.13.1
  • Datasets 2.10.1
  • Tokenizers 0.13.2