debbiesoon/summarise_v10

This model is a fine-tuned version of allenai/led-base-16384 on the SGH news articles and summaries dataset. It achieves the following results on the evaluation set:

Loss: 1.9680
Rouge1 Precision: 0.4404
Rouge1 Recall: 0.5874
Rouge1 Fmeasure: 0.4653
Rouge2 Precision: 0.2673
Rouge2 Recall: 0.3871
Rouge2 Fmeasure: 0.2897
Rougel Precision: 0.3059
Rougel Recall: 0.4418
Rougel Fmeasure: 0.3308
Rougelsum Precision: 0.3059
Rougelsum Recall: 0.4418
Rougelsum Fmeasure: 0.3308
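
To make the three numbers reported per ROUGE variant concrete, here is a minimal sketch of how precision, recall, and F-measure relate for ROUGE-1 on a single reference/candidate pair. The helper `rouge1` is hypothetical and uses plain whitespace tokenisation. The scores above were presumably produced by a standard ROUGE library with its own tokenisation, so this illustrates the definitions rather than reproducing the reported values.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict:
    """Unigram-overlap precision, recall and F-measure for one pair."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a word counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    denom = precision + recall
    fmeasure = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "fmeasure": fmeasure}


# Illustrative strings only, not taken from the SGH dataset.
scores = rouge1("the cat sat on the mat", "the cat sat")
print(scores)
```

If the reported figures are averages of per-example scores, which is how aggregated ROUGE results are commonly computed, the F-measure need not equal the harmonic mean of the averaged precision and recall. That would explain why the Rouge1 F-measure of 0.4653 differs from the harmonic mean of 0.4404 and 0.5874 (roughly 0.50).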

Model description

This model was created to generate summaries of news articles.

Intended uses & limitations

The model accepts articles of up to 3072 tokens and generates summaries of between 100 and 512 tokens.
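
The limits above could be applied at inference time roughly as follows. This is a sketch rather than the author's own inference script: it assumes the checkpoint loads through the generic `AutoTokenizer` and `AutoModelForSeq2SeqLM` classes, and it places global attention on the first token, which is the usual convention for LED summarisation but is not stated in this card. The function name `summarise` is hypothetical.

```python
MODEL_ID = "debbiesoon/summarise_v10"
MAX_INPUT_TOKENS = 3072    # articles longer than this are truncated
MAX_SUMMARY_TOKENS = 512
MIN_SUMMARY_TOKENS = 100


def generation_kwargs() -> dict:
    """Length limits passed to model.generate()."""
    return {"max_length": MAX_SUMMARY_TOKENS, "min_length": MIN_SUMMARY_TOKENS}


def summarise(article: str) -> str:
    """Summarise one article; downloads the checkpoint on first call."""
    # Imported lazily so the constants above can be used without these packages.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(
        article,
        max_length=MAX_INPUT_TOKENS,
        truncation=True,
        return_tensors="pt",
    )
    # Assumption: global attention on the first token only (common LED setup).
    global_attention_mask = torch.zeros_like(inputs["input_ids"])
    global_attention_mask[:, 0] = 1

    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        global_attention_mask=global_attention_mask,
        **generation_kwargs(),
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Calling `summarise(article_text)` returns the decoded summary as a string. Other decoding settings, such as beam search, are not documented here and are left at the model's defaults.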

Training and evaluation data

This model was trained on more than 100 article-summary pairs from SGH.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
mixed_precision_training: Native AMP
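
For anyone trying to reproduce the run, the hyperparameters above map onto `transformers` training arguments roughly as sketched below. The mapping assumes a single device, so that `train_batch_size` equals `per_device_train_batch_size`, and no gradient accumulation. The output directory and the helper `build_training_args` are hypothetical.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments field names.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 4,  # assumes a single device
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
    "fp16": True,  # Native AMP; requires a CUDA-capable GPU
}


def build_training_args(output_dir: str = "./summarise_v10_out"):
    """Build Seq2SeqTrainingArguments; output_dir is a placeholder path."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

The returned object can be passed to a `Seq2SeqTrainer` together with the model, tokenizer, and datasets.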

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1 Precision	Rouge1 Recall	Rouge1 Fmeasure	Rouge2 Precision	Rouge2 Recall	Rouge2 Fmeasure	Rougel Precision	Rougel Recall	Rougel Fmeasure	Rougelsum Precision	Rougelsum Recall	Rougelsum Fmeasure
1.4834	0.43	10	1.7001	0.2304	0.6761	0.3152	0.1326	0.4034	0.1797	0.1495	0.4624	0.2069	0.1495	0.4624	0.2069
1.5011	0.87	20	1.6051	0.4301	0.5372	0.4087	0.2481	0.3439	0.245	0.2878	0.3928	0.2834	0.2878	0.3928	0.2834
0.9289	1.3	30	1.5501	0.431	0.597	0.4364	0.2653	0.393	0.2736	0.3007	0.4233	0.3037	0.3007	0.4233	0.3037
1.0895	1.74	40	1.5969	0.4661	0.5481	0.4486	0.2736	0.3439	0.2689	0.3318	0.4045	0.3221	0.3318	0.4045	0.3221
0.7785	2.17	50	1.5875	0.4527	0.5405	0.4209	0.2942	0.3634	0.272	0.3268	0.4047	0.3042	0.3268	0.4047	0.3042
0.635	2.61	60	1.6081	0.4142	0.5649	0.4172	0.242	0.3659	0.2549	0.2787	0.4156	0.2909	0.2787	0.4156	0.2909
0.514	3.04	70	1.6150	0.4431	0.5665	0.4569	0.2656	0.3754	0.2853	0.3252	0.441	0.3434	0.3252	0.441	0.3434
0.5617	3.48	80	1.6447	0.3956	0.6304	0.451	0.2353	0.425	0.2776	0.2883	0.4904	0.3332	0.2883	0.4904	0.3332
0.396	3.91	90	1.7423	0.4276	0.609	0.4506	0.2657	0.4142	0.2858	0.3091	0.4677	0.3316	0.3091	0.4677	0.3316
0.3427	4.35	100	1.7572	0.3877	0.5633	0.4169	0.216	0.3635	0.2468	0.2706	0.4314	0.3018	0.2706	0.4314	0.3018
0.3059	4.78	110	1.7705	0.4255	0.5524	0.4429	0.2495	0.3488	0.2671	0.3184	0.4275	0.3358	0.3184	0.4275	0.3358
0.2083	5.22	120	1.7840	0.4533	0.5896	0.4655	0.284	0.4142	0.308	0.3164	0.4442	0.3376	0.3164	0.4442	0.3376
0.2591	5.65	130	1.8396	0.4391	0.5315	0.4209	0.2768	0.3661	0.2707	0.3194	0.4124	0.3111	0.3194	0.4124	0.3111
0.2609	6.09	140	1.8220	0.4425	0.5712	0.4465	0.2642	0.3738	0.2727	0.3093	0.4349	0.3208	0.3093	0.4349	0.3208
0.1696	6.52	150	1.8916	0.475	0.5557	0.4686	0.2959	0.3783	0.3019	0.3409	0.4268	0.3442	0.3409	0.4268	0.3442
0.2683	6.96	160	1.8957	0.445	0.5918	0.4748	0.285	0.4021	0.3075	0.3249	0.4551	0.3522	0.3249	0.4551	0.3522
0.1259	7.39	170	1.9371	0.4473	0.5368	0.4664	0.2608	0.3355	0.282	0.3276	0.4071	0.3492	0.3276	0.4071	0.3492
0.1919	7.83	180	1.9521	0.4026	0.5528	0.438	0.2362	0.3427	0.2604	0.2751	0.3957	0.3042	0.2751	0.3957	0.3042
0.1279	8.26	190	1.9398	0.413	0.6053	0.4575	0.2511	0.403	0.2881	0.2662	0.4195	0.3027	0.2662	0.4195	0.3027
0.1176	8.7	200	1.9556	0.4363	0.565	0.4492	0.2591	0.3727	0.2806	0.3107	0.428	0.3289	0.3107	0.428	0.3289
0.1299	9.13	210	1.9642	0.4385	0.5728	0.4587	0.2687	0.3744	0.2888	0.3212	0.436	0.3404	0.3212	0.436	0.3404
0.1303	9.57	220	1.9649	0.43	0.5648	0.439	0.2605	0.3624	0.2691	0.2958	0.4135	0.3067	0.2958	0.4135	0.3067
0.1129	10.0	230	1.9680	0.4404	0.5874	0.4653	0.2673	0.3871	0.2897	0.3059	0.4418	0.3308	0.3059	0.4418	0.3308
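
The Epoch and Step columns also give a rough indication of the training set size. The arithmetic below is an inference from the table, not a figure reported by the author, and it assumes a single device, no gradient accumulation, and that a final partial batch is kept.

```python
TOTAL_STEPS = 230        # last Step value in the table
NUM_EPOCHS = 10
TRAIN_BATCH_SIZE = 4

steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# With ceil(n / batch_size) == steps_per_epoch, n lies in this range.
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

# Cross-check against the first row of the table (step 10, epoch 0.43).
epoch_at_step_10 = round(10 / steps_per_epoch, 2)

print(steps_per_epoch, min_examples, max_examples, epoch_at_step_10)
# prints: 23 89 92 0.43
```

Under those assumptions the training split holds between 89 and 92 examples.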

Framework versions

Transformers 4.21.3
Pytorch 1.12.1+cu113
Datasets 1.2.1
Tokenizers 0.12.1