opus-mt-cantonese-v1
This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-zh. It achieves the following results on the evaluation set:
- Loss: 3.7970
- Bleu: 1.5351
- Gen Len: 12.6626
**Check out version 2, which has more training data.**
Model description
This model translates English into Cantonese.
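A minimal inference sketch using the Transformers pipeline API. The repository ID below is the one shown for this model, and the example sentence and `max_length` value are only illustrative:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as an English-to-Cantonese translation pipeline.
# Adjust the model ID if you are loading from a local path instead.
translator = pipeline("translation", model="edwinlaw/opus-mt-cantonese-v1")

result = translator("Where is the nearest train station?", max_length=64)
print(result[0]["translation_text"])
```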
Intended uses & limitations
Translations produced by this model are intended for experimental purposes only. Use at your own risk.
Training and evaluation data
Trained on 1,232 Cantonese sentences paired with English translations from CantoDict.
Training procedure
The data was split 80% training / 20% validation, and the model was trained for 120 epochs.
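A sketch of the split described above, assuming the CantoDict pairs were exported to a CSV file; the file name and column layout are hypothetical, not part of the released model:

```python
from datasets import load_dataset

# Hypothetical CSV export of the 1,232 English-Cantonese pairs from CantoDict.
raw = load_dataset("csv", data_files="cantodict_pairs.csv")["train"]

# 80% training / 20% validation, with a fixed seed for reproducibility.
splits = raw.train_test_split(test_size=0.2, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```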
Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 2e-06
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 120
- mixed_precision_training: Native AMP
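A sketch of how the hyperparameters above map onto `Seq2SeqTrainingArguments`. The `output_dir`, evaluation strategy, and generation setting are assumptions, since the original training script is not included in this card; the Adam betas and epsilon listed above are the library defaults and need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Configuration sketch reflecting the reported hyperparameters (assumed, not the exact script).
training_args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-cantonese-v1",   # assumed output directory
    learning_rate=2e-6,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=120,
    lr_scheduler_type="linear",
    fp16=True,                           # mixed precision (native AMP)
    evaluation_strategy="epoch",         # assumed, to match the per-epoch table below
    predict_with_generate=True,          # needed to report BLEU / Gen Len during evaluation
)
```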
Training results
Validation loss dropped to a low of 3.7154 and then rose again, indicating overfitting. The table below shows the last 60 epochs.
Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
---|---|---|---|---|---|
No log | 1.0 | 62 | 3.7171 | 1.4013 | 12.6341 |
No log | 2.0 | 124 | 3.7238 | 1.4412 | 12.7154 |
No log | 3.0 | 186 | 3.7223 | 1.4767 | 12.6098 |
No log | 4.0 | 248 | 3.7239 | 1.4567 | 12.6585 |
No log | 5.0 | 310 | 3.7230 | 1.4797 | 12.6707 |
No log | 6.0 | 372 | 3.7228 | 1.4804 | 12.6585 |
No log | 7.0 | 434 | 3.7244 | 1.4968 | 12.5976 |
No log | 8.0 | 496 | 3.7268 | 1.4797 | 12.6057 |
1.8247 | 9.0 | 558 | 3.7248 | 1.486 | 12.6911 |
1.8247 | 10.0 | 620 | 3.7292 | 1.4833 | 12.6829 |
1.8247 | 11.0 | 682 | 3.7276 | 1.4767 | 12.687 |
1.8247 | 12.0 | 744 | 3.7321 | 1.454 | 12.8374 |
1.8247 | 13.0 | 806 | 3.7357 | 1.467 | 12.8211 |
1.8247 | 14.0 | 868 | 3.7373 | 1.4605 | 12.8171 |
1.8247 | 15.0 | 930 | 3.7369 | 1.4793 | 12.748 |
1.8247 | 16.0 | 992 | 3.7385 | 1.4626 | 12.8984 |
1.6597 | 17.0 | 1054 | 3.7406 | 1.4831 | 12.8089 |
1.6597 | 18.0 | 1116 | 3.7439 | 1.513 | 12.7846 |
1.6597 | 19.0 | 1178 | 3.7423 | 1.514 | 12.6545 |
1.6597 | 20.0 | 1240 | 3.7485 | 1.4928 | 12.8659 |
1.6597 | 21.0 | 1302 | 3.7493 | 1.5506 | 12.7033 |
1.6597 | 22.0 | 1364 | 3.7544 | 1.5185 | 12.7439 |
1.6597 | 23.0 | 1426 | 3.7558 | 1.4922 | 12.8049 |
1.6597 | 24.0 | 1488 | 3.7589 | 1.4803 | 12.7683 |
1.5288 | 25.0 | 1550 | 3.7586 | 1.5488 | 12.7642 |
1.5288 | 26.0 | 1612 | 3.7591 | 1.5345 | 12.748 |
1.5288 | 27.0 | 1674 | 3.7615 | 1.5416 | 12.7805 |
1.5288 | 28.0 | 1736 | 3.7646 | 1.5416 | 12.752 |
1.5288 | 29.0 | 1798 | 3.7665 | 1.5377 | 12.7683 |
1.5288 | 30.0 | 1860 | 3.7670 | 1.515 | 12.7561 |
1.5288 | 31.0 | 1922 | 3.7680 | 1.515 | 12.7846 |
1.5288 | 32.0 | 1984 | 3.7705 | 1.5181 | 12.7236 |
1.425 | 33.0 | 2046 | 3.7717 | 1.502 | 12.7236 |
1.425 | 34.0 | 2108 | 3.7741 | 1.5461 | 12.6992 |
1.425 | 35.0 | 2170 | 3.7781 | 1.4945 | 12.7561 |
1.425 | 36.0 | 2232 | 3.7790 | 1.5391 | 12.748 |
1.425 | 37.0 | 2294 | 3.7818 | 1.5798 | 12.7154 |
1.425 | 38.0 | 2356 | 3.7827 | 1.5653 | 12.7276 |
1.425 | 39.0 | 2418 | 3.7833 | 1.525 | 12.7195 |
1.425 | 40.0 | 2480 | 3.7853 | 1.522 | 12.752 |
1.3476 | 41.0 | 2542 | 3.7875 | 1.522 | 12.7195 |
1.3476 | 42.0 | 2604 | 3.7880 | 1.4983 | 12.7276 |
1.3476 | 43.0 | 2666 | 3.7891 | 1.5532 | 12.752 |
1.3476 | 44.0 | 2728 | 3.7896 | 1.5532 | 12.7398 |
1.3476 | 45.0 | 2790 | 3.7915 | 1.5013 | 12.7439 |
1.3476 | 46.0 | 2852 | 3.7933 | 1.5051 | 12.7358 |
1.3476 | 47.0 | 2914 | 3.7921 | 1.5013 | 12.7195 |
1.3476 | 48.0 | 2976 | 3.7922 | 1.5081 | 12.7073 |
1.3068 | 49.0 | 3038 | 3.7928 | 1.5081 | 12.7033 |
1.3068 | 50.0 | 3100 | 3.7935 | 1.5043 | 12.7073 |
1.3068 | 51.0 | 3162 | 3.7939 | 1.5043 | 12.7073 |
1.3068 | 52.0 | 3224 | 3.7951 | 1.5051 | 12.7154 |
1.3068 | 53.0 | 3286 | 3.7947 | 1.5351 | 12.6707 |
1.3068 | 54.0 | 3348 | 3.7951 | 1.5382 | 12.6667 |
1.3068 | 55.0 | 3410 | 3.7954 | 1.5351 | 12.6748 |
1.3068 | 56.0 | 3472 | 3.7958 | 1.5351 | 12.6748 |
1.279 | 57.0 | 3534 | 3.7962 | 1.5281 | 12.6545 |
1.279 | 58.0 | 3596 | 3.7967 | 1.5281 | 12.6545 |
1.279 | 59.0 | 3658 | 3.7969 | 1.5351 | 12.6626 |
1.279 | 60.0 | 3720 | 3.7970 | 1.5351 | 12.6626 |
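For reference, the BLEU and Gen Len columns above are the metrics a `Seq2SeqTrainer` typically reports via a sacrebleu-based `compute_metrics` function. A minimal sketch of that recipe is shown below; the card does not include the original training code, so this is an assumption about how the numbers were produced, not the author's exact script.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Replace label padding (-100) before decoding the references.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    # Gen Len: average number of non-padding tokens in the generated outputs.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```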
Framework versions
- Transformers 4.39.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2