metadata

license: apache-2.0
base_model: google/mt5-large
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: cs_mT5-large2_0.01_50_v0.1
    results: []

cs_mT5-large2_0.01_50_v0.1

This model is a fine-tuned version of google/mt5-large; the training dataset is not specified. After the final epoch (50), it achieves the following results on the evaluation set:

Loss: 7.9807
Bleu: 0.6899
Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.01
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
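
As a rough illustration of what the linear scheduler implies, the sketch below computes the learning rate at a few points in training. It assumes zero warmup steps (no warmup value is listed above) and takes the 6 optimizer steps per epoch from the training results table, giving 300 steps in total. The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step, base_lr=0.01, total_steps=300, warmup_steps=0):
    """Learning rate after `step` updates: linear warmup, then linear decay to 0.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 6   # from the training results table (step 6 at epoch 1)
NUM_EPOCHS = 50       # num_epochs above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

for epoch in (0, 10, 25, 50):
    step = epoch * STEPS_PER_EPOCH
    lr = linear_lr(step, total_steps=TOTAL_STEPS)
    print(f"epoch {epoch:>2}  step {step:>3}  lr {lr:.4f}")
```

Under these assumptions the rate starts at 0.01, is halved by epoch 25, and reaches zero at step 300.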

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
2.6703	1.0	6	8.3163	0.0	19.0
3.3427	2.0	12	6.5085	0.5004	19.0
2.8652	3.0	18	7.0200	0.6899	19.0
2.9454	4.0	24	7.3333	0.2191	19.0
2.7918	5.0	30	7.5745	0.4671	12.0
3.5645	6.0	36	6.3676	0.0	19.0
3.0885	7.0	42	7.0359	0.6908	19.0
3.5374	8.0	48	6.8709	0.1154	12.3333
3.3746	9.0	54	6.4090	0.0	19.0
2.5927	10.0	60	6.7357	0.6381	19.0
2.581	11.0	66	6.5953	0.2635	19.0
4.1786	12.0	72	7.8617	0.2068	19.0
2.8545	13.0	78	6.2553	0.1628	19.0
2.8925	14.0	84	6.6297	0.612	19.0
3.2424	15.0	90	7.0312	0.7377	19.0
2.379	16.0	96	6.9121	0.6562	19.0
2.2356	17.0	102	6.9446	0.2759	15.0
3.0548	18.0	108	7.5770	0.1529	19.0
2.4637	19.0	114	7.1444	0.4497	19.0
2.96	20.0	120	6.8181	0.3779	11.0
2.2016	21.0	126	6.6893	0.6562	19.0
1.9774	22.0	132	7.3802	0.3807	19.0
1.6734	23.0	138	6.7319	0.5405	19.0
3.1958	24.0	144	7.1645	1.2379	19.0
3.1363	25.0	150	7.7097	0.3794	19.0
2.4353	26.0	156	6.9324	0.2522	14.0
2.8675	27.0	162	6.7989	0.1488	19.0
1.7486	28.0	168	7.1052	0.7123	19.0
2.775	29.0	174	7.0195	0.7393	19.0
1.8752	30.0	180	6.9133	0.2119	19.0
1.7576	31.0	186	7.2143	0.2641	19.0
2.2793	32.0	192	7.0029	1.1166	19.0
1.98	33.0	198	6.9954	0.5348	19.0
1.4242	34.0	204	7.5163	0.2088	19.0
2.413	35.0	210	7.0622	0.1433	19.0
1.2191	36.0	216	7.0088	0.5307	12.0
1.5944	37.0	222	7.7706	0.1948	19.0
1.0044	38.0	228	7.7163	0.8485	18.4286
1.4428	39.0	234	7.4919	0.6033	19.0
3.0175	40.0	240	7.4158	0.5109	19.0
1.3632	41.0	246	6.9819	0.4326	11.0
1.8384	42.0	252	7.0156	0.6215	18.2381
1.3237	43.0	258	7.2082	0.4826	19.0
1.1516	44.0	264	7.5088	0.4745	18.8571
1.3893	45.0	270	7.7298	0.4527	19.0
1.0125	46.0	276	7.8458	0.832	14.0
0.8954	47.0	282	7.9101	0.7754	17.5714
1.8111	48.0	288	7.9713	0.6899	19.0
1.2008	49.0	294	7.9821	0.6899	19.0
1.5131	50.0	300	7.9807	0.6899	19.0
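
The final-epoch numbers reported at the top of this card are not the best in the table: BLEU peaks at 1.2379 (epoch 24) and validation loss is lowest at 6.2553 (epoch 13). A minimal sketch of picking an epoch by metric is shown below, using a four-row excerpt copied from the table. The `EvalRow` type is a hypothetical helper, and the card does not state whether checkpoints for these epochs were saved.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    val_loss: float
    bleu: float


# Excerpt copied from the table above; add the remaining rows as needed.
ROWS = [
    EvalRow(13, 6.2553, 0.1628),
    EvalRow(24, 7.1645, 1.2379),
    EvalRow(32, 7.0029, 1.1166),
    EvalRow(50, 7.9807, 0.6899),
]

best_bleu = max(ROWS, key=lambda r: r.bleu)          # highest BLEU
lowest_loss = min(ROWS, key=lambda r: r.val_loss)    # lowest validation loss
final = ROWS[-1]                                     # last epoch, as reported

print(f"best BLEU:   epoch {best_bleu.epoch} (BLEU {best_bleu.bleu})")
print(f"lowest loss: epoch {lowest_loss.epoch} (loss {lowest_loss.val_loss})")
print(f"final:       epoch {final.epoch} (BLEU {final.bleu}, loss {final.val_loss})")
```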

Framework versions

Transformers 4.35.2
PyTorch 1.13.1+cu117
Datasets 2.17.0
Tokenizers 0.15.2