	---
	license: apache-2.0
	base_model: google/flan-t5-base
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: flant5action
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# flant5action

	This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.1428
	- Rouge1: 56.0664
	- Rouge2: 34.7343
	- Rougel: 56.0394
	- Rougelsum: 56.0313
	- Gen Len: 18.9852
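
For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-1 F1 score can be computed. It is not the evaluation code used for this model. It assumes lowercase whitespace tokenization with no stemming, which is simpler than dedicated ROUGE packages, and the example strings are hypothetical. The scores in this card appear to be on a 0 to 100 scale, so the sketch multiplies by 100 when printing.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a prediction and a reference.

    Assumption: lowercase whitespace tokenization and no stemming.
    """
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical strings, not taken from the model's evaluation set.
score = rouge1_f1("open the door", "open door")
print(round(100 * score, 4))
```

ROUGE-2 follows the same pattern over bigrams, and ROUGE-L replaces the n-gram overlap with the longest common subsequence.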

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 25
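
To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear learning-rate schedule in plain Python. It assumes zero warmup steps, since no warmup is listed above, and takes the total of 16850 steps from the last row of the training results table. The function name `linear_lr` is illustrative and not part of any library.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 25 epochs x 674 steps per epoch, as in the training results table.
TOTAL_STEPS = 16850

for step in (0, 8425, 16850):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 5e-05, is roughly half that at the midpoint of training, and reaches zero at the final step.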

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
	|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| 0.2525 | 1.0 | 674 | 0.2294 | 53.2181 | 29.8509 | 53.1635 | 53.1474 | 19.0 |
	| 0.2434 | 2.0 | 1348 | 0.2240 | 53.5453 | 30.1367 | 53.4479 | 53.44 | 18.9970 |
	| 0.2281 | 3.0 | 2022 | 0.2135 | 53.1901 | 30.3456 | 53.0849 | 53.0759 | 18.9970 |
	| 0.2221 | 4.0 | 2696 | 0.2056 | 52.0669 | 29.4321 | 51.9567 | 51.9512 | 18.9881 |
	| 0.2145 | 5.0 | 3370 | 0.2012 | 54.484 | 31.6451 | 54.4213 | 54.4144 | 18.9970 |
	| 0.2121 | 6.0 | 4044 | 0.1961 | 54.1219 | 31.2019 | 54.0701 | 54.0668 | 18.9970 |
	| 0.1979 | 7.0 | 4718 | 0.1901 | 54.9091 | 32.2416 | 54.8482 | 54.8318 | 18.9911 |
	| 0.2086 | 8.0 | 5392 | 0.1846 | 54.9615 | 32.4701 | 54.8836 | 54.8821 | 18.9970 |
	| 0.1985 | 9.0 | 6066 | 0.1795 | 55.2027 | 32.5792 | 55.1531 | 55.1431 | 18.9970 |
	| 0.2027 | 10.0 | 6740 | 0.1746 | 54.4079 | 32.2598 | 54.38 | 54.3697 | 18.9970 |
	| 0.1922 | 11.0 | 7414 | 0.1707 | 55.4814 | 33.2069 | 55.4428 | 55.4298 | 18.9970 |
	| 0.1806 | 12.0 | 8088 | 0.1660 | 55.7189 | 33.831 | 55.6796 | 55.6702 | 18.9970 |
	| 0.1834 | 13.0 | 8762 | 0.1623 | 55.6253 | 33.9516 | 55.5925 | 55.585 | 18.9941 |
	| 0.1795 | 14.0 | 9436 | 0.1596 | 55.6786 | 33.7589 | 55.6232 | 55.6183 | 18.9911 |
	| 0.1767 | 15.0 | 10110 | 0.1553 | 55.8132 | 34.1603 | 55.795 | 55.7873 | 18.9911 |
	| 0.1792 | 16.0 | 10784 | 0.1539 | 55.9694 | 34.4612 | 55.9454 | 55.9323 | 18.9792 |
	| 0.1785 | 17.0 | 11458 | 0.1521 | 56.2202 | 34.6224 | 56.1781 | 56.1706 | 18.9941 |
	| 0.1705 | 18.0 | 12132 | 0.1496 | 56.4102 | 34.7821 | 56.3911 | 56.3789 | 18.9911 |
	| 0.1668 | 19.0 | 12806 | 0.1478 | 56.1222 | 34.6804 | 56.0821 | 56.077 | 18.9881 |
	| 0.1729 | 20.0 | 13480 | 0.1459 | 56.1605 | 34.8596 | 56.1349 | 56.1221 | 18.9852 |
	| 0.1759 | 21.0 | 14154 | 0.1451 | 56.1232 | 34.8956 | 56.1054 | 56.0994 | 18.9852 |
	| 0.1713 | 22.0 | 14828 | 0.1439 | 55.9801 | 34.6435 | 55.9556 | 55.9482 | 18.9763 |
	| 0.1751 | 23.0 | 15502 | 0.1436 | 56.2088 | 34.8754 | 56.1771 | 56.1758 | 18.9852 |
	| 0.1626 | 24.0 | 16176 | 0.1431 | 56.0657 | 34.7302 | 56.04 | 56.0317 | 18.9852 |
	| 0.1696 | 25.0 | 16850 | 0.1428 | 56.0664 | 34.7343 | 56.0394 | 56.0313 | 18.9852 |
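
The table can be summarized with a few lines of Python. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table, then finds the epoch with the lowest validation loss and the epoch with the highest Rouge1. It also estimates the training set size from the 674 steps per epoch and the batch size of 8. That estimate assumes a single device, no gradient accumulation, and that the last partial batch is kept, none of which the card confirms.

```python
# (epoch, validation_loss, rouge1), copied from the training results table.
ROWS = [
    (1, 0.2294, 53.2181), (2, 0.2240, 53.5453), (3, 0.2135, 53.1901),
    (4, 0.2056, 52.0669), (5, 0.2012, 54.484), (6, 0.1961, 54.1219),
    (7, 0.1901, 54.9091), (8, 0.1846, 54.9615), (9, 0.1795, 55.2027),
    (10, 0.1746, 54.4079), (11, 0.1707, 55.4814), (12, 0.1660, 55.7189),
    (13, 0.1623, 55.6253), (14, 0.1596, 55.6786), (15, 0.1553, 55.8132),
    (16, 0.1539, 55.9694), (17, 0.1521, 56.2202), (18, 0.1496, 56.4102),
    (19, 0.1478, 56.1222), (20, 0.1459, 56.1605), (21, 0.1451, 56.1232),
    (22, 0.1439, 55.9801), (23, 0.1436, 56.2088), (24, 0.1431, 56.0657),
    (25, 0.1428, 56.0664),
]

STEPS_PER_EPOCH = 674   # step count after epoch 1
TRAIN_BATCH_SIZE = 8    # from the hyperparameter list

best_loss = min(ROWS, key=lambda row: row[1])
best_rouge1 = max(ROWS, key=lambda row: row[2])

# Assumption: one device, no gradient accumulation, last partial batch kept.
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1

print("lowest validation loss:", best_loss)
print("highest Rouge1:", best_rouge1)
print("estimated training examples:", min_examples, "to", max_examples)
```

Validation loss falls at every epoch and is lowest at epoch 25, while Rouge1 peaks at epoch 18 and changes little afterwards.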


	### Framework versions

	- Transformers 4.31.0
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.3
	- Tokenizers 0.13.3