metadata

license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: Word-selector
    results: []

Word-selector

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set after the final training epoch (epoch 20, step 8000):

Loss: 2.1303
Rouge1: 0.3216
Rouge2: 0.0621
Rougel: 0.2469
Rougelsum: 0.2469
Gen Len: 48.8488
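
For readers unfamiliar with the ROUGE columns, the sketch below shows roughly what Rouge1 and Rouge2 measure: the F1 score of unigram and bigram overlap between a reference text and a generated text, on a 0 to 1 scale as reported above. It is a simplified illustration that assumes lowercased whitespace tokenization. The evaluation behind this card most likely used a library implementation with its own tokenization and optional stemming, so this function will not reproduce the reported numbers exactly. The function names and example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1: clipped n-gram overlap, whitespace tokens."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n_f1(reference, candidate, n=1), 4))  # unigram overlap
print(round(rouge_n_f1(reference, candidate, n=2), 4))  # bigram overlap
```

Rouge2 is much lower than Rouge1 in the results above, which is typical: matching two consecutive words is a stricter requirement than matching single words.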

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 16
eval_batch_size: 12
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
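
As a rough guide to reproducing this setup, the sketch below maps the values listed above onto the keyword names used by `Seq2SeqTrainingArguments` in Transformers 4.37. Several details are assumptions and not stated in this card: the batch sizes are treated as per-device values on a single device, `output_dir` is a placeholder, evaluation is assumed to run once per epoch (the results table has one row per epoch), `predict_with_generate` is assumed because ROUGE and Gen Len require generated text, and the linear schedule is assumed to use no warmup since none is listed. The helper names are hypothetical.

```python
# Hyperparameters reported in this card, keyed by the argument names
# that transformers.Seq2SeqTrainingArguments accepts.
TRAINING_KWARGS = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 12,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
}

TOTAL_STEPS = 8000  # final step in the training results table


def build_training_args(output_dir="word-selector-run"):
    """Build training arguments; output_dir and the two flags are assumptions."""
    # Imported lazily so the rest of this sketch runs without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        evaluation_strategy="epoch",  # assumed: one evaluation per epoch
        predict_with_generate=True,   # assumed: needed for ROUGE and Gen Len
        **TRAINING_KWARGS,
    )


def linear_lr(step, base_lr=2e-4, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at a given step under a linear schedule.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 4000, 8000):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate falls in a straight line from 0.0002 at step 0 to zero at step 8000, so the later epochs train with progressively smaller updates.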

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
No log	1.0	400	2.2490	0.1594	0.0161	0.1319	0.1321	69.2094
2.7083	2.0	800	2.1665	0.2025	0.0287	0.1648	0.1647	69.5888
2.369	3.0	1200	2.1296	0.2381	0.0344	0.1878	0.1878	57.9775
2.2185	4.0	1600	2.0890	0.2525	0.0399	0.1986	0.1984	60.2588
2.1014	5.0	2000	2.0731	0.2795	0.0484	0.2199	0.2199	49.5737
2.1014	6.0	2400	2.0601	0.2862	0.0525	0.2249	0.2246	54.4206
1.9992	7.0	2800	2.0592	0.3004	0.0533	0.2351	0.2351	49.9325
1.9232	8.0	3200	2.0529	0.3033	0.0558	0.2366	0.2368	49.8744
1.8534	9.0	3600	2.0600	0.3024	0.0573	0.2366	0.2366	50.355
1.795	10.0	4000	2.0715	0.3082	0.0561	0.2392	0.2392	47.2162
1.795	11.0	4400	2.0657	0.3137	0.0595	0.2437	0.2439	50.3438
1.73	12.0	4800	2.0759	0.3142	0.0597	0.2434	0.2433	51.1619
1.6844	13.0	5200	2.0818	0.3172	0.0605	0.2458	0.2458	48.9956
1.6398	14.0	5600	2.0942	0.3149	0.0599	0.2428	0.243	47.3812
1.6063	15.0	6000	2.1047	0.3171	0.0609	0.243	0.243	51.685
1.6063	16.0	6400	2.1095	0.3234	0.0622	0.248	0.248	50.1588
1.5659	17.0	6800	2.1180	0.3212	0.0627	0.2479	0.2478	49.0894
1.5456	18.0	7200	2.1212	0.3208	0.0616	0.2455	0.2456	48.8688
1.5177	19.0	7600	2.1275	0.3214	0.0628	0.2467	0.2467	48.4125
1.5161	20.0	8000	2.1303	0.3216	0.0621	0.2469	0.2469	48.8488
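
The table shows that the final checkpoint is not the best one on every metric. Validation loss is lowest at epoch 8 and rises afterwards while training loss keeps falling, whereas Rouge1 peaks at epoch 16. The short sketch below recovers these points from the table and derives the number of optimizer steps per epoch. Only the epoch, step, validation loss, and Rouge1 columns are copied in. The upper bound on training set size assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# (epoch, step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (1, 400, 2.2490, 0.1594),
    (2, 800, 2.1665, 0.2025),
    (3, 1200, 2.1296, 0.2381),
    (4, 1600, 2.0890, 0.2525),
    (5, 2000, 2.0731, 0.2795),
    (6, 2400, 2.0601, 0.2862),
    (7, 2800, 2.0592, 0.3004),
    (8, 3200, 2.0529, 0.3033),
    (9, 3600, 2.0600, 0.3024),
    (10, 4000, 2.0715, 0.3082),
    (11, 4400, 2.0657, 0.3137),
    (12, 4800, 2.0759, 0.3142),
    (13, 5200, 2.0818, 0.3172),
    (14, 5600, 2.0942, 0.3149),
    (15, 6000, 2.1047, 0.3171),
    (16, 6400, 2.1095, 0.3234),
    (17, 6800, 2.1180, 0.3212),
    (18, 7200, 2.1212, 0.3208),
    (19, 7600, 2.1275, 0.3214),
    (20, 8000, 2.1303, 0.3216),
]

TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameters section

best_loss = min(ROWS, key=lambda row: row[2])    # lowest validation loss
best_rouge1 = max(ROWS, key=lambda row: row[3])  # highest Rouge1

last_epoch, last_step = ROWS[-1][0], ROWS[-1][1]
steps_per_epoch = last_step // last_epoch

# Upper bound only: assumes one device and no gradient accumulation,
# and the last batch of each epoch may be smaller than the others.
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"lowest validation loss: {best_loss[2]} at epoch {best_loss[0]}")
print(f"highest Rouge1: {best_rouge1[3]} at epoch {best_rouge1[0]}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"training examples (upper bound): {max_train_examples}")
```

Which checkpoint counts as best therefore depends on the selection metric: choosing by validation loss would favor epoch 8, while choosing by Rouge1 would favor epoch 16.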

Framework versions

Transformers 4.37.2
PyTorch 2.1.1+cu121
Datasets 3.0.1
Tokenizers 0.15.1