psxjp5/mt5-small_mid_lr_mid_decay

	---
	license: apache-2.0
	base_model: google/mt5-small
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	- bleu
	model-index:
	- name: mt5-small_mid_lr_mid_decay
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mt5-small_mid_lr_mid_decay

	This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small). The training dataset is not specified.
	It achieves the following results on the evaluation set (these values match the last logged checkpoint, step 1575 at epoch 8.96):
	- Loss: 0.7428
	- Rouge1: 43.12
	- Rouge2: 37.6639
	- Rougel: 41.8367
	- Rougelsum: 41.904
	- Bleu: 31.957
	- Gen Len: 12.1285
	- Meteor: 0.3936
	- No ans accuracy: 22.29
	- Av cosine sim: 0.7406
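
The card does not document the expected input format, so the following is only a minimal sketch of how a checkpoint like this is usually loaded with the Transformers `Auto*` classes. The repository path is the one shown on the model page. The `generate` helper, its `max_new_tokens=32` default, and the placeholder input are illustrative assumptions, not the author's documented usage.

```python
# Repository path as shown on the model page.
MODEL_ID = "psxjp5/mt5-small_mid_lr_mid_decay"


def generate(texts, max_new_tokens=32):
    """Run the fine-tuned checkpoint on a list of input strings.

    Hypothetical helper: the prompt format the model expects is not
    documented in the card, so inputs are passed through unchanged.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `generate(["your input text"])` downloads the weights on first use. The 32-token cap is a guess that leaves headroom over the reported average generation length of about 12 tokens.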

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.001
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 9
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20
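
The batch-size and learning-rate entries above are linked by simple arithmetic. The sketch below recomputes the total batch size and evaluates a linear schedule at a few steps. It assumes a single device, zero warmup steps (no warmup is listed), and about 175 optimizer steps per epoch (read off the training results table). These three values are assumptions, not settings confirmed by the card.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-3
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 8
NUM_EPOCHS = 20

# Assumptions, not stated in the card.
NUM_DEVICES = 1
WARMUP_STEPS = 0
STEPS_PER_EPOCH = 175


def effective_batch_size(per_device, accum_steps, devices=1):
    """Examples consumed per optimizer step."""
    return per_device * accum_steps * devices


def linear_lr(step, base_lr, total_steps, warmup_steps=0):
    """Linear warmup to base_lr, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))  # 128

for step in (0, 175, 1575, total_steps):
    print(step, linear_lr(step, LEARNING_RATE, total_steps, WARMUP_STEPS))
```

Under these assumptions the schedule would still be at a little over half of the initial learning rate at step 1575, the last step logged in the results table.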

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Bleu    | Gen Len | Meteor | No ans accuracy | Av cosine sim |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|:-------:|:------:|:---------------:|:-------------:|
	| 3.1455        | 1.0   | 175  | 0.9832          | 18.7107 | 15.4897 | 18.1977 | 18.2212   | 7.0634  | 7.6229  | 0.1626 | 22.4000         | 0.3949        |
	| 1.1623        | 1.99  | 350  | 0.8542          | 38.7675 | 32.704  | 37.3557 | 37.3949   | 27.4323 | 12.5135 | 0.3487 | 17.9900         | 0.6992        |
	| 0.9431        | 2.99  | 525  | 0.8017          | 41.6216 | 35.6002 | 40.2386 | 40.2881   | 30.7994 | 12.8117 | 0.3755 | 18.37           | 0.7304        |
	| 0.8119        | 3.98  | 700  | 0.7787          | 43.5805 | 37.4117 | 42.1059 | 42.155    | 32.9646 | 13.2176 | 0.3947 | 17.7400         | 0.7582        |
	| 0.7235        | 4.98  | 875  | 0.7477          | 43.4124 | 37.2017 | 41.8468 | 41.9097   | 32.9345 | 13.116  | 0.3946 | 18.92           | 0.7561        |
	| 0.6493        | 5.97  | 1050 | 0.7266          | 40.4764 | 34.9927 | 39.0999 | 39.1711   | 29.0601 | 11.748  | 0.3687 | 22.6500         | 0.7071        |
	| 0.5871        | 6.97  | 1225 | 0.7284          | 43.3812 | 37.5544 | 42.0405 | 42.0865   | 32.8345 | 12.6063 | 0.3949 | 21.05           | 0.7485        |
	| 0.5453        | 7.96  | 1400 | 0.7389          | 43.4549 | 37.76   | 42.1025 | 42.215    | 32.6726 | 12.4537 | 0.3965 | 21.44           | 0.7496        |
	| 0.5038        | 8.96  | 1575 | 0.7428          | 43.12   | 37.6639 | 41.8367 | 41.904    | 31.957  | 12.1285 | 0.3936 | 22.29           | 0.7406        |
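
The headline numbers in this card come from the last logged row, but the best value of each metric occurs at a different step. The sketch below copies a few columns from the table and picks the best row per metric. The `best_step` helper is illustrative, not part of the training code.

```python
# (step, validation_loss, rouge1, bleu) copied from the training results table.
ROWS = [
    (175, 0.9832, 18.7107, 7.0634),
    (350, 0.8542, 38.7675, 27.4323),
    (525, 0.8017, 41.6216, 30.7994),
    (700, 0.7787, 43.5805, 32.9646),
    (875, 0.7477, 43.4124, 32.9345),
    (1050, 0.7266, 40.4764, 29.0601),
    (1225, 0.7284, 43.3812, 32.8345),
    (1400, 0.7389, 43.4549, 32.6726),
    (1575, 0.7428, 43.12, 31.957),
]

COLUMNS = {"validation_loss": 1, "rouge1": 2, "bleu": 3}


def best_step(rows, metric, lower_is_better=False):
    """Return the step whose row has the best value for the given metric."""
    index = COLUMNS[metric]
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[index])[0]


print(best_step(ROWS, "validation_loss", lower_is_better=True))  # 1050
print(best_step(ROWS, "rouge1"))  # 700
print(best_step(ROWS, "bleu"))  # 700
```

Validation loss is lowest at step 1050, while Rouge1 and Bleu peak at step 700, so which checkpoint counts as "best" depends on the metric chosen.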


	### Framework versions

	- Transformers 4.31.0
	- Pytorch 2.0.1+cu118
	- Datasets 2.13.1
	- Tokenizers 0.13.3
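
To compare a local environment against the versions listed above, a small check with the standard-library `importlib.metadata` module could look like the sketch below. The mapping of "Pytorch" to the PyPI package name `torch` and the helper names are assumptions made for illustration.

```python
from importlib import metadata

# Versions listed in the card; "torch" is assumed to be the package behind "Pytorch".
PINNED = {
    "transformers": "4.31.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.13.1",
    "tokenizers": "0.13.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(pinned):
    """Return {package: (expected, installed)} for every package that differs."""
    report = {}
    for package, expected in pinned.items():
        found = installed_version(package)
        if found != expected:
            report[package] = (expected, found)
    return report


for package, (expected, found) in version_mismatches(PINNED).items():
    print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

The comparison is an exact string match, so a PyTorch build with a different CUDA or CPU suffix is reported as a mismatch even when the base version is the same.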