metadata

license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-zh
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: opus-mt-cantonese-v1
    results: []

opus-mt-cantonese-v1

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-zh on an unknown dataset. After the final training epoch (epoch 50, step 3100), it achieves the following results on the evaluation set:

Loss: 3.7189
Bleu: 1.3095
Gen Len: 12.8089
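
These figures match the last row of the training results table, so they describe the final checkpoint rather than necessarily the best one. As a minimal sketch, the rows below are a subset copied from that table (not the full log), and `best_by` is a hypothetical helper introduced only for illustration; it picks the row with the best value of a chosen metric.

```python
# A subset of rows copied from the training results table:
# (epoch, validation_loss, bleu)
rows = [
    (36, 3.7230, 1.3338),
    (40, 3.7214, 1.3510),
    (43, 3.7208, 1.3519),
    (47, 3.7192, 1.3576),
    (50, 3.7189, 1.3095),
]


def best_by(rows, index, higher_is_better):
    """Return the row with the best value in column `index` (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[index])


best_bleu = best_by(rows, index=2, higher_is_better=True)
best_loss = best_by(rows, index=1, higher_is_better=False)

print(f"best BLEU: epoch {best_bleu[0]} ({best_bleu[2]})")
print(f"lowest validation loss: epoch {best_loss[0]} ({best_loss[1]})")
```

With these rows, BLEU peaks at epoch 47 while validation loss is lowest at epoch 50, so which checkpoint counts as "best" depends on the metric chosen.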

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-06
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
mixed_precision_training: Native AMP
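
To make the `lr_scheduler_type: linear` setting concrete, here is a minimal sketch of a linear decay schedule. It assumes no warmup steps (none are listed above) and takes the total of 3100 optimizer steps from the training results table; `linear_lr` is a hypothetical helper written for this illustration, not a library function.

```python
BASE_LR = 2e-06      # learning_rate from the list above
TOTAL_STEPS = 3100   # final step in the training results table


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` under linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1550, 3100):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-06, is roughly halved at the midpoint of training, and reaches zero at the last step.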

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
No log	1.0	62	4.0260	0.5212	12.1748
No log	2.0	124	3.9917	0.5398	12.2033
No log	3.0	186	3.9573	0.923	12.0894
No log	4.0	248	3.9330	0.9257	12.252
No log	5.0	310	3.9073	0.9197	12.2154
No log	6.0	372	3.8840	0.9586	12.2561
No log	7.0	434	3.8681	0.9702	12.3374
No log	8.0	496	3.8540	0.9676	12.3415
3.0832	9.0	558	3.8380	0.9564	12.4268
3.0832	10.0	620	3.8276	0.963	12.5081
3.0832	11.0	682	3.8159	0.9326	12.5528
3.0832	12.0	744	3.8086	0.9326	12.5772
3.0832	13.0	806	3.8007	0.9668	12.5813
3.0832	14.0	868	3.7919	0.922	12.7073
3.0832	15.0	930	3.7833	0.9319	12.626
3.0832	16.0	992	3.7754	1.0907	12.7033
2.6953	17.0	1054	3.7698	1.0914	12.7317
2.6953	18.0	1116	3.7657	1.1198	12.7642
2.6953	19.0	1178	3.7597	1.2304	12.6707
2.6953	20.0	1240	3.7555	1.2345	12.7683
2.6953	21.0	1302	3.7519	1.2465	12.7439
2.6953	22.0	1364	3.7506	1.2322	12.7764
2.6953	23.0	1426	3.7480	1.2558	12.7642
2.6953	24.0	1488	3.7453	1.2465	12.7317
2.4546	25.0	1550	3.7415	1.2614	12.7073
2.4546	26.0	1612	3.7377	1.2339	12.7073
2.4546	27.0	1674	3.7346	1.2664	12.7195
2.4546	28.0	1736	3.7315	1.2664	12.7195
2.4546	29.0	1798	3.7310	1.3041	12.7033
2.4546	30.0	1860	3.7293	1.2715	12.687
2.4546	31.0	1922	3.7266	1.2941	12.6748
2.4546	32.0	1984	3.7266	1.2988	12.7398
2.2894	33.0	2046	3.7260	1.3227	12.7439
2.2894	34.0	2108	3.7243	1.3227	12.752
2.2894	35.0	2170	3.7240	1.3227	12.752
2.2894	36.0	2232	3.7230	1.3338	12.7276
2.2894	37.0	2294	3.7242	1.3338	12.7724
2.2894	38.0	2356	3.7224	1.3338	12.7764
2.2894	39.0	2418	3.7210	1.3338	12.7642
2.2894	40.0	2480	3.7214	1.351	12.7642
2.1784	41.0	2542	3.7215	1.3283	12.7967
2.1784	42.0	2604	3.7208	1.3173	12.7642
2.1784	43.0	2666	3.7208	1.3519	12.7114
2.1784	44.0	2728	3.7200	1.3519	12.7114
2.1784	45.0	2790	3.7198	1.3173	12.7886
2.1784	46.0	2852	3.7199	1.3519	12.752
2.1784	47.0	2914	3.7192	1.3576	12.7642
2.1784	48.0	2976	3.7191	1.3282	12.7967
2.1365	49.0	3038	3.7189	1.3282	12.8089
2.1365	50.0	3100	3.7189	1.3095	12.8089
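
The Step column gives a rough handle on the size of the training set, which the card otherwise leaves unspecified. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; under those assumptions, 62 steps per epoch at a batch size of 16 bounds the number of training examples. The function name is hypothetical.

```python
import math

STEPS_PER_EPOCH = 62   # step count after epoch 1 in the table
BATCH_SIZE = 16        # train_batch_size from the hyperparameters
NUM_EPOCHS = 50        # num_epochs from the hyperparameters


def dataset_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest N with ceil(N / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


low, high = dataset_size_bounds(STEPS_PER_EPOCH, BATCH_SIZE)

# Sanity check: both ends of the range produce 62 batches per epoch.
assert math.ceil(low / BATCH_SIZE) == STEPS_PER_EPOCH
assert math.ceil(high / BATCH_SIZE) == STEPS_PER_EPOCH

print(f"training examples: between {low} and {high}")
print(f"total steps: {STEPS_PER_EPOCH * NUM_EPOCHS}")
```

The "No log" entries in the Training Loss column are consistent with the Trainer default of logging training loss every 500 steps: the first logged value appears at the first evaluation after step 500 (epoch 9, step 558), and the value changes only after each further 500 steps.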

Framework versions

Transformers 4.39.2
PyTorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2