jbochi/flan-t5-large-spelling-peft


	---
	license: apache-2.0
	base_model: google/flan-t5-large
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: flan-t5-large-spelling-peft
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# flan-t5-large-spelling-peft

	This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.2537
	- Rouge1: 95.8905
	- Rouge2: 91.9178
	- Rougel: 95.8459
	- Rougelsum: 95.8393
	- Gen Len: 33.61
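For readers unfamiliar with the metric, the following is a minimal sketch of a ROUGE-1 F-measure, scaled by 100 as in the list above. It assumes lowercase whitespace tokenization and uses a made-up sentence pair; the evaluation behind this card most likely relied on a dedicated ROUGE package with its own tokenizer and optional stemming, so the numbers are not directly comparable.

```python
from collections import Counter


def rouge1_f(reference: str, prediction: str) -> float:
    """Unigram-overlap F-measure between two whitespace-tokenized strings."""
    ref_counts = Counter(reference.lower().split())
    pred_counts = Counter(prediction.lower().split())
    # Clipped overlap: each token counts at most as often as it occurs in both.
    overlap = sum((ref_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical spelling-correction pair (not taken from the model's data):
# the prediction leaves one misspelled token ("teh") uncorrected.
reference = "the quick brown fox jumps over the lazy dog"
prediction = "the quick brown fox jumps over teh lazy dog"

score = 100 * rouge1_f(reference, prediction)
print(f"{score:.2f}")  # prints 88.89
```

Because spelling correction leaves most tokens unchanged, even an uncorrected input overlaps heavily with its reference, so high ROUGE values are expected for this task and small differences between checkpoints matter more than the absolute level.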

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.001
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 1
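
As an illustration of what `lr_scheduler_type: linear` implies for the learning rate of 0.001, here is a small sketch of a linear decay schedule. It assumes no warmup (none is listed above) and a total of about 9,375 optimizer steps, which is an estimate derived from the last row of the training results (step 9000 at epoch 0.96), not a value stated in this card. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 1e-3, warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: roughly 9000 / 0.96 steps in the single training epoch.
TOTAL_STEPS = 9375

for step in (0, 4500, 9000, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 0.001, falls steadily, and reaches zero at the final step.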

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| 0.3359        | 0.05  | 500  | 0.2738          | 95.8385 | 91.6723 | 95.7821 | 95.766    | 33.5    |
	| 0.2853        | 0.11  | 1000 | 0.2702          | 95.7124 | 91.5043 | 95.656  | 95.651    | 33.53   |
	| 0.2691        | 0.16  | 1500 | 0.2691          | 95.735  | 91.7108 | 95.7039 | 95.7067   | 33.41   |
	| 0.2596        | 0.21  | 2000 | 0.2663          | 95.9819 | 92.0897 | 95.9519 | 95.9488   | 33.51   |
	| 0.2536        | 0.27  | 2500 | 0.2621          | 95.7519 | 91.5445 | 95.6614 | 95.6622   | 33.49   |
	| 0.2472        | 0.32  | 3000 | 0.2626          | 95.7052 | 91.7321 | 95.6476 | 95.6512   | 33.58   |
	| 0.2448        | 0.37  | 3500 | 0.2669          | 95.8003 | 91.7949 | 95.7536 | 95.7576   | 33.57   |
	| 0.2345        | 0.43  | 4000 | 0.2582          | 95.8784 | 92.008  | 95.8284 | 95.8343   | 33.65   |
	| 0.2345        | 0.48  | 4500 | 0.2629          | 95.8131 | 91.9088 | 95.7624 | 95.766    | 33.63   |
	| 0.2284        | 0.53  | 5000 | 0.2585          | 95.8552 | 91.9833 | 95.8105 | 95.8135   | 33.62   |
	| 0.2266        | 0.59  | 5500 | 0.2591          | 95.9205 | 92.0577 | 95.8689 | 95.8718   | 33.61   |
	| 0.2281        | 0.64  | 6000 | 0.2605          | 95.9172 | 91.9782 | 95.874  | 95.8638   | 33.59   |
	| 0.2228        | 0.69  | 6500 | 0.2566          | 95.7612 | 91.7858 | 95.7129 | 95.7058   | 33.63   |
	| 0.2202        | 0.75  | 7000 | 0.2561          | 95.9468 | 92.0914 | 95.9018 | 95.8941   | 33.64   |
	| 0.218         | 0.8   | 7500 | 0.2579          | 95.9468 | 92.0914 | 95.9018 | 95.8941   | 33.64   |
	| 0.2162        | 0.85  | 8000 | 0.2523          | 95.8231 | 91.9464 | 95.7727 | 95.7758   | 33.66   |
	| 0.2135        | 0.91  | 8500 | 0.2549          | 95.8388 | 91.9804 | 95.7914 | 95.7917   | 33.63   |
	| 0.2124        | 0.96  | 9000 | 0.2537          | 95.8905 | 91.9178 | 95.8459 | 95.8393   | 33.61   |
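
The card does not name the dataset, but the step and epoch columns above allow a rough size estimate. The sketch below copies the step, epoch, and validation-loss columns from the table, estimates the number of steps per epoch, and locates the checkpoint with the lowest validation loss. It assumes a single device and no gradient accumulation (neither is listed under the hyperparameters), and the epoch values are rounded to two decimals, so the result is a range rather than an exact count.

```python
# (step, epoch, validation_loss) copied from the training results table.
rows = [
    (500, 0.05, 0.2738), (1000, 0.11, 0.2702), (1500, 0.16, 0.2691),
    (2000, 0.21, 0.2663), (2500, 0.27, 0.2621), (3000, 0.32, 0.2626),
    (3500, 0.37, 0.2669), (4000, 0.43, 0.2582), (4500, 0.48, 0.2629),
    (5000, 0.53, 0.2585), (5500, 0.59, 0.2591), (6000, 0.64, 0.2605),
    (6500, 0.69, 0.2566), (7000, 0.75, 0.2561), (7500, 0.80, 0.2579),
    (8000, 0.85, 0.2523), (8500, 0.91, 0.2549), (9000, 0.96, 0.2537),
]
TRAIN_BATCH_SIZE = 64  # train_batch_size from the hyperparameters

# Point estimate from the last logged row.
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

# The epoch column is rounded, so 0.96 could be anywhere in [0.955, 0.965).
low_steps = last_step / (last_epoch + 0.005)
high_steps = last_step / (last_epoch - 0.005)

# Assumption: one device, no gradient accumulation.
approx_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Checkpoint with the lowest validation loss among the logged steps.
best_step, _, best_loss = min(rows, key=lambda row: row[2])

print(f"steps per epoch: ~{steps_per_epoch:.0f} "
      f"(between {low_steps:.0f} and {high_steps:.0f})")
print(f"training examples: ~{approx_examples:.0f}")
print(f"lowest validation loss: {best_loss} at step {best_step}")
```

Under these assumptions the training set holds on the order of 600,000 examples, and the lowest logged validation loss (0.2523) occurs at step 8000 rather than at the final evaluation at step 9000 (0.2537), whose results are the ones reported at the top of the card.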


	### Framework versions

	- Transformers 4.35.2
	- PyTorch 2.1.0+cu121
	- Datasets 2.16.0
	- Tokenizers 0.15.0