t5-base-finetuned-ancient_chinese-to-modern_chinese

This model is a fine-tuned version of t5-base. The training dataset is not recorded in this card. At the end of training (epoch 15) it achieves the following results on the evaluation set:

  • Loss: 0.1221
  • Bleu: 84.7874
  • Gen Len: 7.4143

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
  • mixed_precision_training: Native AMP
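
As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear decay schedule and bounds the size of the training set. It rests on assumptions the card does not state: zero warmup steps, decay to zero at the final step (10740, the last step in the results table), a single device, and no gradient accumulation. The name `linear_lr` is hypothetical and is not a library function.

```python
# Sketch of a linear decay schedule; assumes zero warmup, which the card does not state.
BASE_LR = 2e-05        # learning_rate from the card
NUM_EPOCHS = 15        # num_epochs from the card
STEPS_PER_EPOCH = 716  # step count logged at epoch 1.0 in the results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 10740, matches the last logged step
TRAIN_BATCH_SIZE = 16  # train_batch_size from the card


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Bounds on the number of training examples, assuming one device, no gradient
# accumulation, and that the last (possibly partial) batch of each epoch is kept.
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE            # 716 full batches
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1  # 715 full batches + 1 example

if __name__ == "__main__":
    for epoch in (0, 5, 10, 15):
        print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * STEPS_PER_EPOCH):.3e}")
    print(f"training examples: between {min_examples} and {max_examples}")
```

Under these assumptions the learning rate falls to half its initial value midway through training, and the training set holds between 11441 and 11456 examples.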

Training results

Training Loss   Epoch   Step    Validation Loss   Bleu      Gen Len
0.1833          1.0     716     0.1371            83.007    7.5431
0.1528          2.0     1432    0.1286            84.1978   7.4289
0.1414          3.0     2148    0.1279            84.8682   7.4034
0.131           4.0     2864    0.1252            84.6009   7.4209
0.1298          5.0     3580    0.1250            84.7541   7.4146
0.1325          6.0     4296    0.1233            85.0001   7.4097
0.1284          7.0     5012    0.1235            84.7152   7.4122
0.1315          8.0     5728    0.1232            85.2833   7.4097
0.1276          9.0     6444    0.1231            84.7562   7.4104
0.1259          10.0    7160    0.1226            84.684    7.4139
0.1259          11.0    7876    0.1216            84.8757   7.4129
0.1257          12.0    8592    0.1221            84.6458   7.4143
0.1233          13.0    9308    0.1220            84.8371   7.4122
0.1217          14.0    10024   0.1218            84.7984   7.4115
0.1253          15.0    10740   0.1221            84.7874   7.4143
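
The headline figures at the top of the card match the final epoch, which is not the best row on either metric. The sketch below scans the values copied from the table to find the best epoch by BLEU and by validation loss. Whether the published weights come from the last checkpoint or an earlier one is not stated in the card, so the comparison is illustrative only; the variable names are hypothetical.

```python
# (epoch, validation_loss, bleu) rows copied from the training results table.
RESULTS = [
    (1, 0.1371, 83.007),
    (2, 0.1286, 84.1978),
    (3, 0.1279, 84.8682),
    (4, 0.1252, 84.6009),
    (5, 0.1250, 84.7541),
    (6, 0.1233, 85.0001),
    (7, 0.1235, 84.7152),
    (8, 0.1232, 85.2833),
    (9, 0.1231, 84.7562),
    (10, 0.1226, 84.684),
    (11, 0.1216, 84.8757),
    (12, 0.1221, 84.6458),
    (13, 0.1220, 84.8371),
    (14, 0.1218, 84.7984),
    (15, 0.1221, 84.7874),
]

best_by_bleu = max(RESULTS, key=lambda row: row[2])  # highest BLEU
best_by_loss = min(RESULTS, key=lambda row: row[1])  # lowest validation loss
final = RESULTS[-1]                                  # last epoch, as reported in the card

if __name__ == "__main__":
    print("best BLEU:", best_by_bleu)
    print("lowest validation loss:", best_by_loss)
    print("final epoch:", final)
    print("BLEU gap, best vs final:", round(best_by_bleu[2] - final[2], 4))
```

On these numbers, BLEU peaks at epoch 8 and validation loss bottoms out at epoch 11. The differences from the final epoch are small (about 0.5 BLEU and 0.0005 loss), so the metrics are essentially flat after the first few epochs.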

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size

  • 223M params (Safetensors, tensor type F32)