metadata
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-zh
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: opus-mt-cantonese-v1
    results: []

opus-mt-cantonese-v1

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-zh. It achieves the following results on the evaluation set:

  • Loss: 3.7970
  • Bleu: 1.5351
  • Gen Len: 12.6626

**Check out version 2, which has more training data.**

Model description

This model translates English into Cantonese.
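
A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub under the repository id `edwinlaw/opus-mt-cantonese-v1` (an assumption; this card states only the model name) and that `transformers` and `sentencepiece` are installed. The helper names `load_translator` and `translate` are illustrative and not part of any library.

```python
# Assumed Hub repository id; the card itself gives only the model name.
MODEL_ID = "edwinlaw/opus-mt-cantonese-v1"


def load_translator(model_id=MODEL_ID):
    """Build a translation pipeline (downloads the weights on first use)."""
    from transformers import pipeline  # imported lazily; needs network access

    return pipeline("translation", model=model_id)


def translate(sentences, translator, max_length=64):
    """Translate English sentences and return the output strings."""
    outputs = translator(list(sentences), max_length=max_length)
    return [item["translation_text"] for item in outputs]
```

Calling `translate(["Good morning."], load_translator())` should return a list with one Cantonese string. Given the BLEU score reported above, expect rough output.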

Intended uses & limitations

Translations are experimental: with a BLEU score of about 1.5 on the evaluation set, output quality is low. Use at your own risk.

Training and evaluation data

Trained on 1,232 Cantonese sentences paired with English translations from CantoDict.

Training procedure

The data was split 80% for training and 20% for validation. The model was trained for 120 epochs.
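
As a quick sanity check, the split and the batch size of 16 listed under the hyperparameters are consistent with the Step column of the training log, which advances by 62 per epoch. The sketch below assumes a single device, no gradient accumulation, a kept final partial batch, and floor rounding of the training-set size; none of these details are stated in this card.

```python
import math


def steps_per_epoch(n_examples, train_fraction, batch_size):
    """Optimizer steps per epoch when the last partial batch is kept."""
    n_train = int(n_examples * train_fraction)  # assumed rounding: floor
    return math.ceil(n_train / batch_size)


steps = steps_per_epoch(n_examples=1232, train_fraction=0.8, batch_size=16)
print(steps)       # 62
print(steps * 60)  # 3720, the step count logged at epoch 60
```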

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-06
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 120
  • mixed_precision_training: Native AMP
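
The list above can be mapped onto `transformers.Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the original training script: it assumes a single device (so the batch sizes are per-device values), and `predict_with_generate`, the output directory name and the helper `make_training_args` are illustrative assumptions.

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
TRAINING_KWARGS = {
    "learning_rate": 2e-6,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 120,
    "fp16": True,  # "Native AMP"; requires a CUDA device
    "predict_with_generate": True,  # assumption: needed to report Bleu and Gen Len
}


def make_training_args(output_dir="opus-mt-cantonese-v1"):
    """Build the arguments object; imports transformers only when called."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```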

Training results

Validation loss fell to a minimum of 3.7154 and then climbed again while training loss kept decreasing, a sign of overfitting.
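
One common response to this pattern is early stopping on validation loss. The sketch below is a plain-Python illustration of a patience rule, not something used in this training run; the function name `early_stop_epoch` and the patience value are hypothetical. The sample losses are the first eight validation losses from the log in this section. The `transformers` library provides `EarlyStoppingCallback` for the same purpose.

```python
def early_stop_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch); stop_epoch is None if never triggered."""
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, None


# Validation losses for epochs 1 to 8 of the logged run.
logged = [3.7171, 3.7238, 3.7223, 3.7239, 3.7230, 3.7228, 3.7244, 3.7268]
print(early_stop_epoch(logged, patience=3))  # (1, 4)
```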

(Last 60 epochs)

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 62 | 3.7171 | 1.4013 | 12.6341 |
| No log | 2.0 | 124 | 3.7238 | 1.4412 | 12.7154 |
| No log | 3.0 | 186 | 3.7223 | 1.4767 | 12.6098 |
| No log | 4.0 | 248 | 3.7239 | 1.4567 | 12.6585 |
| No log | 5.0 | 310 | 3.7230 | 1.4797 | 12.6707 |
| No log | 6.0 | 372 | 3.7228 | 1.4804 | 12.6585 |
| No log | 7.0 | 434 | 3.7244 | 1.4968 | 12.5976 |
| No log | 8.0 | 496 | 3.7268 | 1.4797 | 12.6057 |
| 1.8247 | 9.0 | 558 | 3.7248 | 1.486 | 12.6911 |
| 1.8247 | 10.0 | 620 | 3.7292 | 1.4833 | 12.6829 |
| 1.8247 | 11.0 | 682 | 3.7276 | 1.4767 | 12.687 |
| 1.8247 | 12.0 | 744 | 3.7321 | 1.454 | 12.8374 |
| 1.8247 | 13.0 | 806 | 3.7357 | 1.467 | 12.8211 |
| 1.8247 | 14.0 | 868 | 3.7373 | 1.4605 | 12.8171 |
| 1.8247 | 15.0 | 930 | 3.7369 | 1.4793 | 12.748 |
| 1.8247 | 16.0 | 992 | 3.7385 | 1.4626 | 12.8984 |
| 1.6597 | 17.0 | 1054 | 3.7406 | 1.4831 | 12.8089 |
| 1.6597 | 18.0 | 1116 | 3.7439 | 1.513 | 12.7846 |
| 1.6597 | 19.0 | 1178 | 3.7423 | 1.514 | 12.6545 |
| 1.6597 | 20.0 | 1240 | 3.7485 | 1.4928 | 12.8659 |
| 1.6597 | 21.0 | 1302 | 3.7493 | 1.5506 | 12.7033 |
| 1.6597 | 22.0 | 1364 | 3.7544 | 1.5185 | 12.7439 |
| 1.6597 | 23.0 | 1426 | 3.7558 | 1.4922 | 12.8049 |
| 1.6597 | 24.0 | 1488 | 3.7589 | 1.4803 | 12.7683 |
| 1.5288 | 25.0 | 1550 | 3.7586 | 1.5488 | 12.7642 |
| 1.5288 | 26.0 | 1612 | 3.7591 | 1.5345 | 12.748 |
| 1.5288 | 27.0 | 1674 | 3.7615 | 1.5416 | 12.7805 |
| 1.5288 | 28.0 | 1736 | 3.7646 | 1.5416 | 12.752 |
| 1.5288 | 29.0 | 1798 | 3.7665 | 1.5377 | 12.7683 |
| 1.5288 | 30.0 | 1860 | 3.7670 | 1.515 | 12.7561 |
| 1.5288 | 31.0 | 1922 | 3.7680 | 1.515 | 12.7846 |
| 1.5288 | 32.0 | 1984 | 3.7705 | 1.5181 | 12.7236 |
| 1.425 | 33.0 | 2046 | 3.7717 | 1.502 | 12.7236 |
| 1.425 | 34.0 | 2108 | 3.7741 | 1.5461 | 12.6992 |
| 1.425 | 35.0 | 2170 | 3.7781 | 1.4945 | 12.7561 |
| 1.425 | 36.0 | 2232 | 3.7790 | 1.5391 | 12.748 |
| 1.425 | 37.0 | 2294 | 3.7818 | 1.5798 | 12.7154 |
| 1.425 | 38.0 | 2356 | 3.7827 | 1.5653 | 12.7276 |
| 1.425 | 39.0 | 2418 | 3.7833 | 1.525 | 12.7195 |
| 1.425 | 40.0 | 2480 | 3.7853 | 1.522 | 12.752 |
| 1.3476 | 41.0 | 2542 | 3.7875 | 1.522 | 12.7195 |
| 1.3476 | 42.0 | 2604 | 3.7880 | 1.4983 | 12.7276 |
| 1.3476 | 43.0 | 2666 | 3.7891 | 1.5532 | 12.752 |
| 1.3476 | 44.0 | 2728 | 3.7896 | 1.5532 | 12.7398 |
| 1.3476 | 45.0 | 2790 | 3.7915 | 1.5013 | 12.7439 |
| 1.3476 | 46.0 | 2852 | 3.7933 | 1.5051 | 12.7358 |
| 1.3476 | 47.0 | 2914 | 3.7921 | 1.5013 | 12.7195 |
| 1.3476 | 48.0 | 2976 | 3.7922 | 1.5081 | 12.7073 |
| 1.3068 | 49.0 | 3038 | 3.7928 | 1.5081 | 12.7033 |
| 1.3068 | 50.0 | 3100 | 3.7935 | 1.5043 | 12.7073 |
| 1.3068 | 51.0 | 3162 | 3.7939 | 1.5043 | 12.7073 |
| 1.3068 | 52.0 | 3224 | 3.7951 | 1.5051 | 12.7154 |
| 1.3068 | 53.0 | 3286 | 3.7947 | 1.5351 | 12.6707 |
| 1.3068 | 54.0 | 3348 | 3.7951 | 1.5382 | 12.6667 |
| 1.3068 | 55.0 | 3410 | 3.7954 | 1.5351 | 12.6748 |
| 1.3068 | 56.0 | 3472 | 3.7958 | 1.5351 | 12.6748 |
| 1.279 | 57.0 | 3534 | 3.7962 | 1.5281 | 12.6545 |
| 1.279 | 58.0 | 3596 | 3.7967 | 1.5281 | 12.6545 |
| 1.279 | 59.0 | 3658 | 3.7969 | 1.5351 | 12.6626 |
| 1.279 | 60.0 | 3720 | 3.7970 | 1.5351 | 12.6626 |

Framework versions

  • Transformers 4.39.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2