emilstabil/mt5-large_V8901


mt5-large_V8901

This model is a fine-tuned version of google/mt5-large. The training dataset is not specified in the card metadata. It achieves the following results on the evaluation set (the values match the final logged checkpoint, step 2500):

Loss: 2.7406
Rouge1: 19.3197
Rouge2: 4.1196
Rougel: 10.4848
Rougelsum: 18.1364
Gen Len: 547.0
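
The card gives no usage instructions, so the following is only a minimal sketch of how a checkpoint like this is typically loaded with the Transformers Auto classes. It assumes the repository is publicly downloadable under the name shown on this card and that sentencepiece is installed for the mT5 tokenizer. The function name generate_text and the limits max_input_tokens and max_new_tokens are hypothetical choices for illustration, not values from the card. The ROUGE metrics suggest a summarization-style task, but the card does not state the task or any input prefix, so none is assumed here.

```python
MODEL_ID = "emilstabil/mt5-large_V8901"  # repository name taken from this card


def generate_text(text, model_id=MODEL_ID, max_input_tokens=512, max_new_tokens=128):
    """Run one input string through the checkpoint and return the decoded output.

    max_input_tokens and max_new_tokens are illustrative assumptions,
    not values reported on the card.
    """
    # Imported lazily so this module can be read or imported
    # without transformers and torch installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the input, truncating anything beyond the assumed input limit.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )

    # Default (greedy) decoding; the card does not document generation settings.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate_text("some input text") downloads the weights on first use. Because the card does not describe the expected input format, outputs should be checked against known examples before relying on them.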

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 11
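
The relationship between the listed batch settings can be checked with a short sketch. It assumes training ran on a single device (the card does not say), in which case the total train batch size is the per-device batch size times the gradient accumulation steps. The second dictionary maps the card values onto the corresponding Seq2SeqTrainingArguments field names. That mapping is an assumption about how the run was configured, and any setting not listed on the card (for example warmup or weight decay) is left out rather than guessed.

```python
# Hyperparameters exactly as listed on the card.
card_hparams = {
    "learning_rate": 5e-05,
    "train_batch_size": 2,
    "eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "total_train_batch_size": 8,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 11,
}

# Assumption: a single training device; the card does not report the device count.
num_devices = 1

# Effective batch size per optimizer step.
effective_batch_size = (
    card_hparams["train_batch_size"]
    * card_hparams["gradient_accumulation_steps"]
    * num_devices
)

# Hypothetical mapping onto transformers.Seq2SeqTrainingArguments keyword names.
training_args_kwargs = {
    "learning_rate": card_hparams["learning_rate"],
    "per_device_train_batch_size": card_hparams["train_batch_size"],
    "per_device_eval_batch_size": card_hparams["eval_batch_size"],
    "seed": card_hparams["seed"],
    "gradient_accumulation_steps": card_hparams["gradient_accumulation_steps"],
    "adam_beta1": card_hparams["adam_betas"][0],
    "adam_beta2": card_hparams["adam_betas"][1],
    "adam_epsilon": card_hparams["adam_epsilon"],
    "lr_scheduler_type": card_hparams["lr_scheduler_type"],
    "num_train_epochs": card_hparams["num_epochs"],
}

print(effective_batch_size)
```

Under the single-device assumption the computed value agrees with the total_train_batch_size of 8 reported above.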

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
12.1229	2.11	500	3.9460	15.5693	2.3792	9.3813	14.6801	547.0
5.6736	4.21	1000	3.1634	17.0598	3.1505	9.4845	15.8919	547.0
3.84	6.32	1500	2.0725	25.9095	8.9258	13.9444	24.3476	547.0
3.6041	8.42	2000	2.7824	19.0967	3.9799	10.5463	17.8219	547.0
3.3024	10.53	2500	2.7406	19.3197	4.1196	10.4848	18.1364	547.0
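
The table can be read programmatically to compare checkpoints. The sketch below copies the rows above into Python and does two things: it picks the row with the lowest validation loss, and it estimates the number of optimizer steps per epoch from the Step and Epoch columns. The training-set size derived at the end is only a rough estimate, since it assumes an effective batch size of 8 and the epoch values in the table are rounded. The variable names are illustrative.

```python
from statistics import mean

# Rows copied from the training results table:
# (train_loss, epoch, step, val_loss, rouge1, rouge2, rougeL, rougeLsum, gen_len)
rows = [
    (12.1229, 2.11, 500, 3.9460, 15.5693, 2.3792, 9.3813, 14.6801, 547.0),
    (5.6736, 4.21, 1000, 3.1634, 17.0598, 3.1505, 9.4845, 15.8919, 547.0),
    (3.84, 6.32, 1500, 2.0725, 25.9095, 8.9258, 13.9444, 24.3476, 547.0),
    (3.6041, 8.42, 2000, 2.7824, 19.0967, 3.9799, 10.5463, 17.8219, 547.0),
    (3.3024, 10.53, 2500, 2.7406, 19.3197, 4.1196, 10.4848, 18.1364, 547.0),
]

# Row with the lowest validation loss (index 3 is the Validation Loss column).
best_row = min(rows, key=lambda r: r[3])
best_step = best_row[2]

# The last logged row, which is what the card reports as its headline metrics.
final_step = rows[-1][2]

# Optimizer steps per epoch, averaged over the logged rows.
steps_per_epoch = mean(step / epoch for _, epoch, step, *_ in rows)

# Rough training-set size, assuming an effective batch size of 8.
approx_train_examples = round(steps_per_epoch) * 8

print(best_step, final_step, round(steps_per_epoch), approx_train_examples)
```

In this table the row at step 1500 has both the lowest validation loss and the highest ROUGE scores, while the headline metrics at the top of the card correspond to the last row at step 2500. The card does not say which checkpoint's weights were uploaded.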

Framework versions

Transformers 4.30.2
Pytorch 1.12.1+git7548e2f
Datasets 2.13.2
Tokenizers 0.13.3
