yesj1234/mbart-mmt_mid3_en-ko

Text2Text Generation

Generated from Trainer

mbartLarge_mid_en-ko1

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set (these match the step-7500 row of the training results, which has the lowest validation loss):

Loss: 1.4106
Bleu: 13.2758
Gen Len: 16.235
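
The card does not include a usage example. The following is a minimal sketch of how an mBART-50 fine-tune is typically loaded for English-to-Korean translation with Transformers. It assumes that the repository ships tokenizer files (otherwise pass the base model ID as `tokenizer_id`) and that the model keeps the mBART-50 language codes `en_XX` and `ko_KR`. The helper name `translate_en_to_ko` and its parameters are illustrative and not part of the card.

```python
# mBART-50 language codes for the language pair suggested by the repository name.
LANG_CODES = {"en": "en_XX", "ko": "ko_KR"}


def translate_en_to_ko(texts, model_id="yesj1234/mbart-mmt_mid3_en-ko",
                       tokenizer_id=None, max_new_tokens=64):
    """Translate a list of English sentences to Korean (hypothetical helper)."""
    # Imported lazily so this module loads even without transformers/torch installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    # Assumption: the fine-tuned repo contains tokenizer files; if it does not,
    # pass tokenizer_id="facebook/mbart-large-50-many-to-many-mmt".
    tokenizer = MBart50TokenizerFast.from_pretrained(
        tokenizer_id or model_id, src_lang=LANG_CODES["en"]
    )
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # mBART-50 selects the output language by forcing the first generated token.
    target_id = tokenizer.convert_tokens_to_ids(LANG_CODES["ko"])
    generated = model.generate(
        **batch, forced_bos_token_id=target_id, max_new_tokens=max_new_tokens
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate_en_to_ko(["How are you today?"])` downloads the weights on first use and returns a list with one Korean string per input sentence.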

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 32
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 500
num_epochs: 40
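
The totals in the list above follow from the per-device values. The sketch below reproduces that arithmetic and shows the shape of a cosine schedule with linear warmup, following the usual formula. The total number of optimizer steps is not stated in the card, so `total_steps` is a placeholder assumption, and the function names are illustrative.

```python
import math


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def cosine_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=50_000):
    """Linear warmup followed by half-cosine decay to zero.

    total_steps=50_000 is a placeholder; the card does not report it.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Values from the hyperparameter list: 4 per device, 4 devices, 2 accumulation steps.
train_total = effective_batch_size(4, 4, 2)  # matches total_train_batch_size
eval_total = effective_batch_size(4, 4)      # evaluation uses no accumulation

if __name__ == "__main__":
    print(train_total, eval_total)
    for s in (0, 250, 500, 25_000, 50_000):
        print(s, cosine_lr(s))
```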

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
1.5855	1.12	1500	1.5215	11.5186	16.204
1.4287	2.24	3000	1.4549	12.2855	16.1497
1.2937	3.37	4500	1.4250	12.6484	16.2152
1.2444	4.49	6000	1.4165	13.0063	16.0749
1.1335	5.61	7500	1.4106	13.2758	16.235
1.0508	6.73	9000	1.4243	13.0601	15.86
0.9462	7.86	10500	1.4497	13.0828	16.0475
0.8464	8.98	12000	1.4692	13.5878	15.9308
0.6995	10.1	13500	1.5572	13.1085	15.9906
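
The table can be read two ways, depending on whether the best checkpoint is chosen by validation loss or by BLEU. The sketch below copies the rows above and picks the best row under each criterion; the `Row` type and variable names are illustrative. Note that the log ends near epoch 10 although num_epochs is 40; the card does not say why.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step val_loss bleu gen_len")

# Rows copied from the training results table.
rows = [
    Row(1.5855, 1.12, 1500, 1.5215, 11.5186, 16.204),
    Row(1.4287, 2.24, 3000, 1.4549, 12.2855, 16.1497),
    Row(1.2937, 3.37, 4500, 1.4250, 12.6484, 16.2152),
    Row(1.2444, 4.49, 6000, 1.4165, 13.0063, 16.0749),
    Row(1.1335, 5.61, 7500, 1.4106, 13.2758, 16.235),
    Row(1.0508, 6.73, 9000, 1.4243, 13.0601, 15.86),
    Row(0.9462, 7.86, 10500, 1.4497, 13.0828, 16.0475),
    Row(0.8464, 8.98, 12000, 1.4692, 13.5878, 15.9308),
    Row(0.6995, 10.1, 13500, 1.5572, 13.1085, 15.9906),
]

best_by_loss = min(rows, key=lambda r: r.val_loss)  # lowest validation loss
best_by_bleu = max(rows, key=lambda r: r.bleu)      # highest BLEU

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss.step, best_by_loss.val_loss)
    print("highest BLEU:", best_by_bleu.step, best_by_bleu.bleu)
```

The lowest validation loss is at step 7500, which is the row reported at the top of the card, while the highest BLEU is at step 12000. Training loss keeps falling after step 7500 while validation loss rises, which is the usual sign of overfitting.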

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1
