---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: Word-selector
    results: []
---

# Word-selector

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

- Loss: 2.1303
- Rouge1: 0.3216
- Rouge2: 0.0621
- Rougel: 0.2469
- Rougelsum: 0.2469
- Gen Len: 48.8488
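
A minimal inference sketch, assuming the repo id `zera09/Word-selector` (inferred from this card, not stated explicitly) and standard seq2seq generation:

```python
# Hedged usage sketch; the repo id below is an assumption from this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "zera09/Word-selector"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Your long input document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
# Average Gen Len on the eval set is ~49 tokens, so a 64-token cap is plenty.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```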

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 12
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
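
A sketch of a matching `Seq2SeqTrainingArguments` configuration; `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions not stated on this card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="Word-selector",   # assumed output directory
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=12,
    seed=42,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    adam_beta1=0.9,               # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumed from the per-epoch rows below
    predict_with_generate=True,   # assumed; needed for ROUGE and Gen Len
)
```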

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log        | 1.0   | 400  | 2.2490          | 0.1594 | 0.0161 | 0.1319 | 0.1321    | 69.2094 |
| 2.7083        | 2.0   | 800  | 2.1665          | 0.2025 | 0.0287 | 0.1648 | 0.1647    | 69.5888 |
| 2.369         | 3.0   | 1200 | 2.1296          | 0.2381 | 0.0344 | 0.1878 | 0.1878    | 57.9775 |
| 2.2185        | 4.0   | 1600 | 2.0890          | 0.2525 | 0.0399 | 0.1986 | 0.1984    | 60.2588 |
| 2.1014        | 5.0   | 2000 | 2.0731          | 0.2795 | 0.0484 | 0.2199 | 0.2199    | 49.5737 |
| 2.1014        | 6.0   | 2400 | 2.0601          | 0.2862 | 0.0525 | 0.2249 | 0.2246    | 54.4206 |
| 1.9992        | 7.0   | 2800 | 2.0592          | 0.3004 | 0.0533 | 0.2351 | 0.2351    | 49.9325 |
| 1.9232        | 8.0   | 3200 | 2.0529          | 0.3033 | 0.0558 | 0.2366 | 0.2368    | 49.8744 |
| 1.8534        | 9.0   | 3600 | 2.0600          | 0.3024 | 0.0573 | 0.2366 | 0.2366    | 50.355  |
| 1.795         | 10.0  | 4000 | 2.0715          | 0.3082 | 0.0561 | 0.2392 | 0.2392    | 47.2162 |
| 1.795         | 11.0  | 4400 | 2.0657          | 0.3137 | 0.0595 | 0.2437 | 0.2439    | 50.3438 |
| 1.73          | 12.0  | 4800 | 2.0759          | 0.3142 | 0.0597 | 0.2434 | 0.2433    | 51.1619 |
| 1.6844        | 13.0  | 5200 | 2.0818          | 0.3172 | 0.0605 | 0.2458 | 0.2458    | 48.9956 |
| 1.6398        | 14.0  | 5600 | 2.0942          | 0.3149 | 0.0599 | 0.2428 | 0.243     | 47.3812 |
| 1.6063        | 15.0  | 6000 | 2.1047          | 0.3171 | 0.0609 | 0.243  | 0.243     | 51.685  |
| 1.6063        | 16.0  | 6400 | 2.1095          | 0.3234 | 0.0622 | 0.248  | 0.248     | 50.1588 |
| 1.5659        | 17.0  | 6800 | 2.1180          | 0.3212 | 0.0627 | 0.2479 | 0.2478    | 49.0894 |
| 1.5456        | 18.0  | 7200 | 2.1212          | 0.3208 | 0.0616 | 0.2455 | 0.2456    | 48.8688 |
| 1.5177        | 19.0  | 7600 | 2.1275          | 0.3214 | 0.0628 | 0.2467 | 0.2467    | 48.4125 |
| 1.5161        | 20.0  | 8000 | 2.1303          | 0.3216 | 0.0621 | 0.2469 | 0.2469    | 48.8488 |
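
The ROUGE columns above can be recomputed with the `evaluate` library; this is a hedged sketch with hypothetical predictions and references, not the card's actual evaluation code:

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["selected words from a generated output"]  # hypothetical
references = ["selected words from the gold target"]      # hypothetical
scores = rouge.compute(predictions=predictions, references=references)
# Keys match the table columns: rouge1, rouge2, rougeL, rougeLsum.
print(scores)
```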

### Framework versions

- Transformers 4.37.2
- Pytorch 2.1.1+cu121
- Datasets 3.0.1
- Tokenizers 0.15.1