---
license: apache-2.0
base_model: google/mt5-small
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-summarize
  results: []
---

# mt5-summarize

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 3.1534
- Rouge1: 0.3153
- Rouge2: 0.1594
- Rougel: 0.2511
- Rougelsum: 0.3397

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 90
- num_epochs: 10

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 4.3806        | 1.0667 | 100  | 3.4709          | 0.2568 | 0.1258 | 0.2214 | 0.2662    |
| 3.7757        | 2.1333 | 200  | 3.2899          | 0.2759 | 0.1388 | 0.2381 | 0.2946    |
| 3.5195        | 3.2    | 300  | 3.1951          | 0.2951 | 0.1523 | 0.2466 | 0.3217    |
| 3.4319        | 4.2667 | 400  | 3.1715          | 0.2813 | 0.1323 | 0.2331 | 0.3011    |
| 3.2402        | 5.3333 | 500  | 3.1704          | 0.3058 | 0.1548 | 0.2513 | 0.3366    |
| 3.2313        | 6.4    | 600  | 3.1657          | 0.3077 | 0.1534 | 0.2461 | 0.3335    |
| 3.1444        | 7.4667 | 700  | 3.1719          | 0.2957 | 0.1453 | 0.2378 | 0.3191    |
| 3.116         | 8.5333 | 800  | 3.1639          | 0.3144 | 0.1540 | 0.2501 | 0.3453    |
| 2.9937        | 9.6    | 900  | 3.1534          | 0.3153 | 0.1594 | 0.2511 | 0.3397    |

### Framework versions

- Transformers 4.42.3
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
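
### Hyperparameter arithmetic (illustrative sketch)

The batch-size and scheduler values in the training hyperparameters can be sanity-checked with a few lines of plain Python. This is a minimal sketch and not the training script used for this model. The helper names `effective_batch_size` and `linear_lr` are hypothetical. The single-device setting is an assumption, chosen because 2 × 16 = 32 matches `total_train_batch_size`. `ASSUMED_TOTAL_STEPS` is also an assumption, since the card does not state the total number of optimizer steps. The schedule follows the usual shape of a linear scheduler with warmup: a linear ramp from zero to the base rate, then a linear decay to zero.

```python
def effective_batch_size(per_device_batch: int,
                         accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step."""
    return per_device_batch * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float,
              warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        # Warmup phase: scale the base rate by the fraction of warmup completed.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale by the fraction of post-warmup steps still remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the model card.
LEARNING_RATE = 0.0005
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 16
WARMUP_STEPS = 90

# Assumption: the card does not report the total step count; this value is
# hypothetical and only serves to make the schedule computable.
ASSUMED_TOTAL_STEPS = 930

if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 45, 90, 500, 900):
        lr = linear_lr(step, LEARNING_RATE, WARMUP_STEPS, ASSUMED_TOTAL_STEPS)
        print(f"step {step:>3}: lr = {lr:.6f}")
```

Under these assumptions the learning rate peaks at step 90 and falls steadily afterwards, so the later checkpoints in the results table were trained with a much smaller rate than the early ones.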