
mt5-base-ainu

This model is a fine-tuned version of google/mt5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1105
  • BLEU: 37.4939

Model description

More information needed

Intended uses & limitations

More information needed
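Until the intended uses are documented, the checkpoint can at least be loaded like any other mT5 seq2seq model via the transformers library. A minimal usage sketch; note that the expected input format, translation direction, and any task prefix are not documented on this card, so the example input is only a placeholder (running this downloads the full 582M-parameter checkpoint):

```python
# Hedged usage sketch: load the fine-tuned checkpoint and generate.
# The input string below is a placeholder -- the card does not specify
# how inputs should be formatted or which direction the model translates.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "aynumosir/mt5-base-ainu"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("placeholder input text", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```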

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 20
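The hyperparameters above can be expressed in code. This is an illustrative sketch only (the actual training script is not published); the field names mirror the transformers `Seq2SeqTrainingArguments` naming, written here as a plain dict so that the effective-batch-size arithmetic is explicit:

```python
# Hypothetical reconstruction of the training configuration listed above.
# Field names follow transformers' Seq2SeqTrainingArguments conventions;
# this is an illustration, not the authors' actual training script.
config = {
    "learning_rate": 5e-4,                 # 0.0005
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.06,
    "num_train_epochs": 20,
}

# The "total_train_batch_size: 32" above is derived, not set directly:
# per-device batch size multiplied by gradient accumulation steps.
total_train_batch_size = (
    config["per_device_train_batch_size"]
    * config["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 32
```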

Training results

| Training Loss | Epoch | Step   | Validation Loss | BLEU    |
|:-------------:|:-----:|:------:|:---------------:|:-------:|
| 2.1267        | 1.0   | 9341   | 1.8026          | 20.8450 |
| 1.6408        | 2.0   | 18682  | 1.4706          | 26.7109 |
| 1.4098        | 3.0   | 28023  | 1.3494          | 30.7048 |
| 1.2546        | 4.0   | 37364  | 1.2910          | 32.5056 |
| 1.124         | 5.0   | 46705  | 1.2617          | 33.7060 |
| 1.0048        | 6.0   | 56046  | 1.2578          | 34.5824 |
| 0.8872        | 7.0   | 65387  | 1.2639          | 35.1029 |
| 0.8103        | 8.0   | 74728  | 1.2955          | 35.7998 |
| 0.7298        | 9.0   | 84069  | 1.3284          | 35.8310 |
| 0.6494        | 10.0  | 93410  | 1.3780          | 36.3268 |
| 0.5696        | 11.0  | 102751 | 1.4343          | 36.2494 |
| 0.5148        | 12.0  | 112092 | 1.4957          | 36.8702 |
| 0.4487        | 13.0  | 121433 | 1.5511          | 36.8981 |
| 0.3941        | 14.0  | 130774 | 1.6235          | 36.8809 |
| 0.3432        | 15.0  | 140115 | 1.6957          | 37.0269 |
| 0.3023        | 16.0  | 149456 | 1.7935          | 37.1839 |
| 0.2614        | 17.0  | 158797 | 1.8619          | 37.1935 |
| 0.2267        | 18.0  | 168138 | 1.9485          | 37.4170 |
| 0.1996        | 19.0  | 177479 | 2.0348          | 37.3585 |
| 0.1746        | 20.0  | 186820 | 2.1105          | 37.4939 |

Framework versions

  • Transformers 4.40.1
  • PyTorch 2.1.2
  • Datasets 2.19.0
  • Tokenizers 0.19.1
Model size: 582M parameters (Safetensors, F32)
