license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-zh
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: opus-mt-cantonese-v1
    results: []

opus-mt-cantonese-v1

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-zh; the fine-tuning dataset is not recorded. After the final epoch (epoch 50), it achieves the following results on the evaluation set:

  • Loss: 3.7189
  • Bleu: 1.3095
  • Gen Len: 12.8089
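
A minimal sketch of loading the checkpoint for inference with the Transformers `pipeline` API. The repository id `edwinlaw/opus-mt-cantonese-v1` is an assumption inferred from the model name, not something the card states, and the input is assumed to be English, as with the base model. Marian checkpoints typically also need the `sentencepiece` package installed.

```python
# Hypothetical repository id: inferred from the model name, not confirmed by
# the card. Replace it with the real repo id or a local checkpoint path.
MODEL_ID = "edwinlaw/opus-mt-cantonese-v1"


def translate(texts, model_id=MODEL_ID, max_length=64):
    """Translate a list of source strings with the fine-tuned Marian model."""
    # Imported lazily so that defining this function does not require
    # transformers to be installed or the model to be downloaded.
    from transformers import pipeline

    translator = pipeline("translation", model=model_id)
    outputs = translator(texts, max_length=max_length)
    # Each output is a dict with a "translation_text" key.
    return [out["translation_text"] for out in outputs]
```

Calling `translate(["Where is the nearest train station?"])` returns a list with one translated string. Given the low BLEU reported above, the output is best treated as experimental.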

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-06
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
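
The values above can be mapped onto `Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction, not the author's training script: it assumes a single device (so the batch sizes are per-device values), that "Native AMP" corresponds to `fp16=True`, that `predict_with_generate=True` was set to compute Bleu and Gen Len, and that the linear schedule used zero warmup steps. The output directory is a hypothetical name.

```python
# Hyperparameters copied from the list above.
HPARAMS = {
    "learning_rate": 2e-06,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 50,
}

# Final step in the Training results table (50 epochs x 62 steps per epoch).
TOTAL_STEPS = 3100


def linear_lr(step, base_lr=HPARAMS["learning_rate"], total_steps=TOTAL_STEPS):
    """Learning rate under linear decay to zero, ASSUMING zero warmup steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


def build_training_args(output_dir="opus-mt-cantonese-v1"):
    """Sketch of equivalent Seq2SeqTrainingArguments (output_dir is hypothetical)."""
    # Lazy import: transformers is only needed when this function is called.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        learning_rate=HPARAMS["learning_rate"],
        per_device_train_batch_size=HPARAMS["train_batch_size"],  # assumes one device
        per_device_eval_batch_size=HPARAMS["eval_batch_size"],
        seed=HPARAMS["seed"],
        adam_beta1=HPARAMS["adam_betas"][0],
        adam_beta2=HPARAMS["adam_betas"][1],
        adam_epsilon=HPARAMS["adam_epsilon"],
        lr_scheduler_type=HPARAMS["lr_scheduler_type"],
        num_train_epochs=HPARAMS["num_epochs"],
        fp16=True,  # assumption for "Native AMP"; requires a CUDA device
        predict_with_generate=True,  # assumption: needed for Bleu / Gen Len
    )
```

Under these assumptions the learning rate starts at 2e-06 and has already halved by step 1550 (epoch 25), which is worth keeping in mind when reading the slow loss curve.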

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|---------------|-------|------|-----------------|--------|---------|
| No log        | 1.0   | 62   | 4.0260          | 0.5212 | 12.1748 |
| No log        | 2.0   | 124  | 3.9917          | 0.5398 | 12.2033 |
| No log        | 3.0   | 186  | 3.9573          | 0.923  | 12.0894 |
| No log        | 4.0   | 248  | 3.9330          | 0.9257 | 12.252  |
| No log        | 5.0   | 310  | 3.9073          | 0.9197 | 12.2154 |
| No log        | 6.0   | 372  | 3.8840          | 0.9586 | 12.2561 |
| No log        | 7.0   | 434  | 3.8681          | 0.9702 | 12.3374 |
| No log        | 8.0   | 496  | 3.8540          | 0.9676 | 12.3415 |
| 3.0832        | 9.0   | 558  | 3.8380          | 0.9564 | 12.4268 |
| 3.0832        | 10.0  | 620  | 3.8276          | 0.963  | 12.5081 |
| 3.0832        | 11.0  | 682  | 3.8159          | 0.9326 | 12.5528 |
| 3.0832        | 12.0  | 744  | 3.8086          | 0.9326 | 12.5772 |
| 3.0832        | 13.0  | 806  | 3.8007          | 0.9668 | 12.5813 |
| 3.0832        | 14.0  | 868  | 3.7919          | 0.922  | 12.7073 |
| 3.0832        | 15.0  | 930  | 3.7833          | 0.9319 | 12.626  |
| 3.0832        | 16.0  | 992  | 3.7754          | 1.0907 | 12.7033 |
| 2.6953        | 17.0  | 1054 | 3.7698          | 1.0914 | 12.7317 |
| 2.6953        | 18.0  | 1116 | 3.7657          | 1.1198 | 12.7642 |
| 2.6953        | 19.0  | 1178 | 3.7597          | 1.2304 | 12.6707 |
| 2.6953        | 20.0  | 1240 | 3.7555          | 1.2345 | 12.7683 |
| 2.6953        | 21.0  | 1302 | 3.7519          | 1.2465 | 12.7439 |
| 2.6953        | 22.0  | 1364 | 3.7506          | 1.2322 | 12.7764 |
| 2.6953        | 23.0  | 1426 | 3.7480          | 1.2558 | 12.7642 |
| 2.6953        | 24.0  | 1488 | 3.7453          | 1.2465 | 12.7317 |
| 2.4546        | 25.0  | 1550 | 3.7415          | 1.2614 | 12.7073 |
| 2.4546        | 26.0  | 1612 | 3.7377          | 1.2339 | 12.7073 |
| 2.4546        | 27.0  | 1674 | 3.7346          | 1.2664 | 12.7195 |
| 2.4546        | 28.0  | 1736 | 3.7315          | 1.2664 | 12.7195 |
| 2.4546        | 29.0  | 1798 | 3.7310          | 1.3041 | 12.7033 |
| 2.4546        | 30.0  | 1860 | 3.7293          | 1.2715 | 12.687  |
| 2.4546        | 31.0  | 1922 | 3.7266          | 1.2941 | 12.6748 |
| 2.4546        | 32.0  | 1984 | 3.7266          | 1.2988 | 12.7398 |
| 2.2894        | 33.0  | 2046 | 3.7260          | 1.3227 | 12.7439 |
| 2.2894        | 34.0  | 2108 | 3.7243          | 1.3227 | 12.752  |
| 2.2894        | 35.0  | 2170 | 3.7240          | 1.3227 | 12.752  |
| 2.2894        | 36.0  | 2232 | 3.7230          | 1.3338 | 12.7276 |
| 2.2894        | 37.0  | 2294 | 3.7242          | 1.3338 | 12.7724 |
| 2.2894        | 38.0  | 2356 | 3.7224          | 1.3338 | 12.7764 |
| 2.2894        | 39.0  | 2418 | 3.7210          | 1.3338 | 12.7642 |
| 2.2894        | 40.0  | 2480 | 3.7214          | 1.351  | 12.7642 |
| 2.1784        | 41.0  | 2542 | 3.7215          | 1.3283 | 12.7967 |
| 2.1784        | 42.0  | 2604 | 3.7208          | 1.3173 | 12.7642 |
| 2.1784        | 43.0  | 2666 | 3.7208          | 1.3519 | 12.7114 |
| 2.1784        | 44.0  | 2728 | 3.7200          | 1.3519 | 12.7114 |
| 2.1784        | 45.0  | 2790 | 3.7198          | 1.3173 | 12.7886 |
| 2.1784        | 46.0  | 2852 | 3.7199          | 1.3519 | 12.752  |
| 2.1784        | 47.0  | 2914 | 3.7192          | 1.3576 | 12.7642 |
| 2.1784        | 48.0  | 2976 | 3.7191          | 1.3282 | 12.7967 |
| 2.1365        | 49.0  | 3038 | 3.7189          | 1.3282 | 12.8089 |
| 2.1365        | 50.0  | 3100 | 3.7189          | 1.3095 | 12.8089 |
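
The log above can be read programmatically. The sketch below copies the last eleven rows (every earlier epoch has a lower Bleu) and picks out the best-Bleu and lowest-loss epochs. It also bounds the training set size from the 62 steps per epoch, assuming a single device, no gradient accumulation, and that the last partial batch is kept; none of these is stated in the card.

```python
# (epoch, validation_loss, bleu) for epochs 40-50, copied from the results log.
ROWS = [
    (40, 3.7214, 1.3510),
    (41, 3.7215, 1.3283),
    (42, 3.7208, 1.3173),
    (43, 3.7208, 1.3519),
    (44, 3.7200, 1.3519),
    (45, 3.7198, 1.3173),
    (46, 3.7199, 1.3519),
    (47, 3.7192, 1.3576),
    (48, 3.7191, 1.3282),
    (49, 3.7189, 1.3282),
    (50, 3.7189, 1.3095),
]

STEPS_PER_EPOCH = 62  # step counter advances by 62 each epoch
BATCH_SIZE = 16       # train_batch_size from the hyperparameter list

# Epoch with the highest Bleu, and the final epoch for comparison.
best_bleu_row = max(ROWS, key=lambda row: row[2])
final_row = ROWS[-1]

# Epochs that share the lowest validation loss.
lowest_loss = min(row[1] for row in ROWS)
lowest_loss_epochs = [row[0] for row in ROWS if row[1] == lowest_loss]

# Bounds on the number of training examples under the stated assumptions:
# 62 batches of at most 16, with at least one example in the last batch.
min_examples = (STEPS_PER_EPOCH - 1) * BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * BATCH_SIZE

print(f"best Bleu: {best_bleu_row[2]} at epoch {best_bleu_row[0]}")
print(f"final Bleu: {final_row[2]} at epoch {final_row[0]}")
print(f"lowest validation loss {lowest_loss} at epochs {lowest_loss_epochs}")
print(f"training examples: between {min_examples} and {max_examples}")
```

The reported checkpoint (epoch 50) ties for the lowest validation loss but is not the best by Bleu, which peaks at epoch 47. The differences are small, so this may simply be noise on a small evaluation set.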

Framework versions

  • Transformers 4.39.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
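
To reproduce the environment, the installed packages can be compared against the versions listed above. This sketch uses only the standard library. The PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual ones and are assumed here, and the `+cu121` suffix will only match a CUDA 12.1 build of PyTorch.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list.
EXPECTED = {
    "transformers": "4.39.2",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions().items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch is not necessarily a problem, but it is a sensible first thing to check if loading or generation behaves differently from what the card reports.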