metadata
library_name: transformers
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
metrics:
  - precision
  - recall
model-index:
  - name: mT5_base
    results: []

mT5_base

This model is a fine-tuned version of google/mt5-small; the training dataset is not specified. Despite the name mT5_base, the base checkpoint is the small mT5 variant. It achieves the following results on the evaluation set:

  • Loss: 0.3417
  • Bleu Score: 47.0526
  • Precision: 17.2043
  • Recall: 17.2043
  • Gen Len: 16.8315
  • Err: 17.2043

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
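To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It uses the 838 steps per epoch reported in the training results. The warmup length is an assumption (none is listed above, so it is set to zero), and `linear_lr` is a hypothetical helper written for illustration, not a Transformers API; it is meant to mirror the usual linear warmup and decay rule rather than reproduce the exact training code.

```python
# Values copied from the hyperparameter list and the training results.
LEARNING_RATE = 1e-4
NUM_EPOCHS = 4
STEPS_PER_EPOCH = 838                        # step count reported at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH   # 3352, the final reported step

# Assumption: no warmup, because the card does not list any warmup steps.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` optimizer updates (hypothetical helper).

    Linear warmup from 0 to base_lr, then linear decay from base_lr to 0.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at each epoch boundary.
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.2e}")
```

Under these assumptions the rate falls by a quarter of its initial value per epoch and reaches zero at the last step.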

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu Score | Precision | Recall | Gen Len | Err |
|---|---|---|---|---|---|---|---|---|
| 2.798 | 1.0 | 838 | 0.5495 | 41.8683 | 7.7658 | 7.7658 | 16.7766 | 7.7658 |
| 0.7216 | 2.0 | 1676 | 0.4311 | 44.9002 | 13.0227 | 13.0227 | 16.8148 | 13.0227 |
| 0.5551 | 3.0 | 2514 | 0.3565 | 46.5247 | 16.0096 | 16.0096 | 16.816 | 16.0096 |
| 0.4951 | 4.0 | 3352 | 0.3417 | 47.0526 | 17.2043 | 17.2043 | 16.8315 | 17.2043 |
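Since the training data is not described, the reported step counts offer one rough way to bound its size. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these are stated in the card. It also checks an observation from the rows above: Precision, Recall and Err carry the same value at every epoch. The helper name `example_count_bounds` is hypothetical.

```python
TRAIN_BATCH_SIZE = 8  # from the hyperparameter list

# Rows copied from the results above:
# (epoch, step, bleu, precision, recall, err)
ROWS = [
    (1, 838, 41.8683, 7.7658, 7.7658, 7.7658),
    (2, 1676, 44.9002, 13.0227, 13.0227, 13.0227),
    (3, 2514, 46.5247, 16.0096, 16.0096, 16.0096),
    (4, 3352, 47.0526, 17.2043, 17.2043, 17.2043),
]


def example_count_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that yield `steps_per_epoch` batches.

    Assumes one device, gradient_accumulation_steps = 1, and that a final
    partial batch is kept rather than dropped.
    """
    lower = (steps_per_epoch - 1) * batch_size + 1
    upper = steps_per_epoch * batch_size
    return lower, upper


steps_per_epoch = ROWS[0][1] // ROWS[0][0]
lower, upper = example_count_bounds(steps_per_epoch, TRAIN_BATCH_SIZE)

# True when the three metric columns coincide in every row.
identical_metric_columns = all(p == r == e for _, _, _, p, r, e in ROWS)

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"estimated training examples: {lower} to {upper}")
    print(f"Precision == Recall == Err in every row: {identical_metric_columns}")
```

If the assumptions hold, the training set holds roughly 6,700 examples. The identical metric columns suggest the three names may refer to one underlying score, though the card does not say how each is computed.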

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0