---
license: apache-2.0
base_model: google/flan-t5-base
tags:
- generated_from_trainer
datasets:
- samsum
metrics:
- rouge
model-index:
- name: flan-t5-base-samsum
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: samsum
      type: samsum
      config: samsum
      split: test
      args: samsum
    metrics:
    - name: Rouge1
      type: rouge
      value: 47.0919
---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# flan-t5-base-samsum

	This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the samsum dataset.
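	It achieves the following results on the evaluation set (the samsum `test` split, per the metadata above). The values match the step-1000 row of the training results table, which has the lowest validation loss: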
	- Loss: 1.3859
	- Rouge1: 47.0919
	- Rouge2: 23.2123
	- Rougel: 39.2407
	- Rougelsum: 43.2174
	- Gen Len: 17.3333
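
To make the Rouge1 figure easier to interpret, here is a simplified sketch of a unigram-overlap F1 score, reported on a 0-100 scale like the numbers above. It is an illustration only: it assumes lowercased, whitespace-split tokens, whereas the ROUGE implementation used for this card may tokenize and stem differently, so it will not reproduce the reported values. The two example summaries are hypothetical and not taken from samsum.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased, whitespace-split tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical prediction/reference pair (3 of 5 unigrams shared on each side).
score = rouge1_f1("amanda baked cookies for jerry", "amanda will bring jerry cookies")
print(round(100 * score, 2))
```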

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
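
Until this section is filled in, the following is a minimal sketch of how a checkpoint like this one could be loaded for dialogue summarization. It relies on the `text2text-generation` task type from the card metadata. Everything else is an assumption: `MODEL_ID` is a placeholder because the card does not state the Hub namespace, the card does not say whether a task prefix was used during fine-tuning, and `max_new_tokens=64` is an arbitrary choice.

```python
from typing import List, Tuple

# Placeholder: replace with the actual Hub repository ID of this checkpoint.
MODEL_ID = "<namespace>/flan-t5-base-samsum"


def format_dialogue(turns: List[Tuple[str, str]]) -> str:
    """Join (speaker, utterance) pairs into 'Speaker: text' lines, one per turn."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)


def summarize(dialogue: str, model_id: str = MODEL_ID, max_new_tokens: int = 64) -> str:
    """Generate a summary for one dialogue (downloads the model on first use)."""
    # Imported lazily so format_dialogue works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text2text-generation", model=model_id)
    return generator(dialogue, max_new_tokens=max_new_tokens)[0]["generated_text"]


# Hypothetical dialogue for illustration.
example = format_dialogue([("Amanda", "I baked cookies."), ("Jerry", "Save me one!")])
print(example)
```

Calling `summarize(example)` would then return the generated summary once `MODEL_ID` points at a real repository.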

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 24
	- eval_batch_size: 24
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 2
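
As a rough illustration of what these settings imply, the sketch below derives the number of optimizer steps and the linearly decaying learning rate. It rests on several assumptions not stated in the card: 14,732 examples in the samsum train split, a single device with no gradient accumulation, and zero warmup steps. Under those assumptions one epoch is 614 steps, which is consistent with the step and epoch columns logged in the training results.

```python
import math

LEARNING_RATE = 5e-05        # from the card
TRAIN_BATCH_SIZE = 24        # from the card
NUM_EPOCHS = 2               # from the card
NUM_TRAIN_EXAMPLES = 14732   # assumption: size of the samsum train split
WARMUP_STEPS = 0             # assumption: the card lists no warmup

steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - WARMUP_STEPS)


print(steps_per_epoch, total_steps)
for step in (0, 600, 1200):
    # Columns: step, fractional epoch, learning rate under the assumptions above.
    print(step, round(step / steps_per_epoch, 2), f"{linear_lr(step):.2e}")
```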

	### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.5121        | 0.08  | 50   | 1.4287          | 46.7806 | 22.8207 | 38.9302 | 42.7835   | 16.9634 |
| 1.46          | 0.16  | 100  | 1.4199          | 46.826  | 22.7844 | 39.0295 | 42.8573   | 17.2393 |
| 1.4515        | 0.24  | 150  | 1.4147          | 46.6646 | 22.9602 | 38.9391 | 42.8187   | 17.1245 |
| 1.4679        | 0.33  | 200  | 1.4121          | 46.8291 | 22.7922 | 39.1404 | 43.1542   | 17.3431 |
| 1.451         | 0.41  | 250  | 1.4109          | 46.8103 | 23.0066 | 39.2832 | 43.2411   | 17.2686 |
| 1.4434        | 0.49  | 300  | 1.4040          | 46.6321 | 22.989  | 39.3016 | 43.0997   | 16.9158 |
| 1.4417        | 0.57  | 350  | 1.4007          | 46.8538 | 22.9937 | 39.2135 | 43.1728   | 17.1172 |
| 1.4781        | 0.65  | 400  | 1.3952          | 46.8055 | 23.036  | 39.2961 | 43.1755   | 17.2076 |
| 1.4626        | 0.73  | 450  | 1.3940          | 47.0996 | 23.2205 | 39.3007 | 43.2286   | 17.2222 |
| 1.4307        | 0.81  | 500  | 1.3955          | 46.8877 | 23.1402 | 39.2634 | 43.1279   | 17.2002 |
| 1.4586        | 0.9   | 550  | 1.3933          | 46.7191 | 23.1291 | 39.2437 | 43.1183   | 17.3040 |
| 1.4465        | 0.98  | 600  | 1.3905          | 46.8651 | 23.29   | 39.2514 | 43.2025   | 17.3468 |
| 1.381         | 1.06  | 650  | 1.3953          | 46.9166 | 22.9547 | 39.0439 | 43.1589   | 17.4066 |
| 1.4125        | 1.14  | 700  | 1.3922          | 46.5286 | 23.0552 | 38.9056 | 42.9298   | 17.2381 |
| 1.3667        | 1.22  | 750  | 1.3922          | 47.3239 | 23.3549 | 39.4725 | 43.518    | 17.2930 |
| 1.3878        | 1.3   | 800  | 1.3953          | 46.6837 | 23.1602 | 39.2578 | 43.2195   | 17.3358 |
| 1.3884        | 1.38  | 850  | 1.3931          | 46.9537 | 23.0894 | 39.1676 | 43.1687   | 17.3614 |
| 1.3766        | 1.47  | 900  | 1.3898          | 46.9996 | 23.1407 | 39.2222 | 43.237    | 17.3333 |
| 1.3727        | 1.55  | 950  | 1.3889          | 46.6936 | 23.0454 | 39.0579 | 42.9472   | 17.3211 |
| 1.4001        | 1.63  | 1000 | 1.3859          | 47.0919 | 23.2123 | 39.2407 | 43.2174   | 17.3333 |
| 1.3894        | 1.71  | 1050 | 1.3874          | 47.2229 | 23.35   | 39.4333 | 43.4876   | 17.3297 |
| 1.3697        | 1.79  | 1100 | 1.3860          | 47.0872 | 23.3503 | 39.3371 | 43.3444   | 17.3504 |
| 1.3886        | 1.87  | 1150 | 1.3862          | 47.0516 | 23.3487 | 39.3653 | 43.3272   | 17.3260 |
| 1.4037        | 1.95  | 1200 | 1.3861          | 47.05   | 23.3672 | 39.3131 | 43.3233   | 17.3321 |
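
The headline metrics at the top of the card equal the step-1000 row rather than the last logged row. The sketch below shows one way to see why, by picking the best row under two different criteria. It uses a hand-copied subset of the table, and it is only a guess that the reported checkpoint was selected by lowest validation loss; the card does not say how it was chosen.

```python
# (step, validation_loss, rouge1) copied from selected rows of the table above.
rows = [
    (450, 1.3940, 47.0996),
    (750, 1.3922, 47.3239),
    (1000, 1.3859, 47.0919),
    (1050, 1.3874, 47.2229),
    (1100, 1.3860, 47.0872),
    (1200, 1.3861, 47.0500),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_rouge1 = max(rows, key=lambda row: row[2])

print("lowest validation loss at step", best_by_loss[0])
print("highest Rouge1 at step", best_by_rouge1[0])
```

The two criteria disagree here: validation loss favors step 1000, while Rouge1 peaks at step 750, so the choice of selection metric matters when comparing checkpoints.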


	### Framework versions

	- Transformers 4.33.2
	- PyTorch 2.0.0+cu117
	- Datasets 2.14.5
	- Tokenizers 0.13.3
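
When trying to reproduce these results, it can help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library. The mapping from the names in the list to PyPI distribution names (for example PyTorch to `torch`) is an assumption, and the helper names are illustrative.

```python
from importlib import metadata

# Versions listed in the card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.33.2",
    "torch": "2.0.0+cu117",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map package -> (expected, found) for every package whose version differs."""
    return {
        name: (want, lookup(name))
        for name, want in expected.items()
        if lookup(name) != want
    }


for name, (want, found) in version_mismatches(CARD_VERSIONS).items():
    print(f"{name}: card lists {want}, found {found}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags differences worth checking if results diverge.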