debbiesoon/summarise_v9

This model is a fine-tuned version of allenai/led-base-16384 on the multi_news dataset. It achieves the following results on the evaluation set:

Loss: 2.3650
Rouge1 Precision: 0.4673
Rouge1 Recall: 0.4135
Rouge1 Fmeasure: 0.4263
Rouge2 Precision: 0.1579
Rouge2 Recall: 0.1426
Rouge2 Fmeasure: 0.1458
Rougel Precision: 0.2245
Rougel Recall: 0.2008
Rougel Fmeasure: 0.2061
Rougelsum Precision: 0.2245
Rougelsum Recall: 0.2008
Rougelsum Fmeasure: 0.2061
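
For readers unfamiliar with these metrics, the following is a minimal sketch of how ROUGE-N precision, recall, and F-measure can be computed for a single generated/reference pair. It assumes whitespace tokenization and no stemming, so it will not reproduce the figures above exactly. The names `ngrams` and `rouge_n` are illustrative; this is not the evaluation code used for this model.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N for one pair: (precision, recall, f_measure)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


generated = "the cat sat on the mat"
reference = "the cat lay on the mat"
print("ROUGE-1:", rouge_n(generated, reference, n=1))
print("ROUGE-2:", rouge_n(generated, reference, n=2))
```

The reported F-measures are not the harmonic mean of the reported precision and recall (for ROUGE-1 that would be about 0.439, against the reported 0.4263). This is consistent with scores being computed per example and then averaged, which is how ROUGE implementations commonly aggregate.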

Model description

This model was created to generate summaries of news articles.

Intended uses & limitations

The model accepts articles of up to 3072 tokens and generates summaries of between 100 and 512 tokens.
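
The limits above can be applied at inference time roughly as follows. This is a sketch, not the author's inference script: it assumes the repository ships tokenizer files loadable with `AutoTokenizer`, it uses the library's default decoding settings because the card does not state any, and it places global attention on the first token, which is the usual convention for LED models. The function name `summarise` and the constant names are illustrative.

```python
# Limits taken from the model card.
MAX_INPUT_TOKENS = 3072
MAX_SUMMARY_TOKENS = 512
MIN_SUMMARY_TOKENS = 100


def summarise(article: str, model_id: str = "debbiesoon/summarise_v9") -> str:
    """Summarise one article; downloads the model on first use."""
    # Imported lazily so the constants can be reused without loading torch.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate anything beyond the stated maximum input length.
    inputs = tokenizer(
        article,
        max_length=MAX_INPUT_TOKENS,
        truncation=True,
        return_tensors="pt",
    )

    # Assumption: global attention on the first token only.
    global_attention_mask = torch.zeros_like(inputs["input_ids"])
    global_attention_mask[:, 0] = 1

    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        global_attention_mask=global_attention_mask,
        max_length=MAX_SUMMARY_TOKENS,
        min_length=MIN_SUMMARY_TOKENS,
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)


# Example call (requires network access and the transformers/torch packages):
# print(summarise(open("article.txt").read()))
```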

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 2
mixed_precision_training: Native AMP
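
Two of these values are derived rather than independent, which the short sketch below illustrates. It assumes a single training device (consistent with 4 × 4 = 16) and zero warmup steps, since the card lists no warmup. The `total_steps` argument is hypothetical: the results table logs up to step 120, but the exact total is not stated. The function names are illustrative.

```python
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=0):
    """Linear schedule: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
# total_steps=120 is an assumed value, used only for illustration.
for step in (0, 60, 120):
    print(step, linear_lr(step, total_steps=120))
```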

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1 Precision	Rouge1 Recall	Rouge1 Fmeasure	Rouge2 Precision	Rouge2 Recall	Rouge2 Fmeasure	Rougel Precision	Rougel Recall	Rougel Fmeasure	Rougelsum Precision	Rougelsum Recall	Rougelsum Fmeasure
2.8095	0.16	10	2.5393	0.287	0.5358	0.3674	0.1023	0.1917	0.1311	0.1374	0.2615	0.1771	0.1374	0.2615	0.1771
2.6056	0.32	20	2.4752	0.5005	0.3264	0.3811	0.1663	0.1054	0.1249	0.2582	0.1667	0.1957	0.2582	0.1667	0.1957
2.5943	0.48	30	2.4422	0.4615	0.3833	0.4047	0.1473	0.1273	0.1321	0.2242	0.1885	0.1981	0.2242	0.1885	0.1981
2.4842	0.64	40	2.4186	0.4675	0.3829	0.4081	0.1581	0.1294	0.1384	0.2286	0.187	0.1995	0.2286	0.187	0.1995
2.4454	0.8	50	2.3990	0.467	0.408	0.4222	0.1633	0.1429	0.1477	0.2294	0.2008	0.2076	0.2294	0.2008	0.2076
2.3622	0.96	60	2.3857	0.4567	0.3898	0.41	0.1433	0.1233	0.1295	0.2205	0.1876	0.1976	0.2205	0.1876	0.1976
2.4034	1.13	70	2.3835	0.4515	0.4304	0.4294	0.1526	0.1479	0.1459	0.2183	0.209	0.2078	0.2183	0.209	0.2078
2.2612	1.29	80	2.3804	0.455	0.4193	0.4236	0.1518	0.1429	0.1427	0.2177	0.2025	0.2037	0.2177	0.2025	0.2037
2.2563	1.45	90	2.3768	0.4821	0.391	0.4196	0.1652	0.1357	0.144	0.2385	0.1929	0.2069	0.2385	0.1929	0.2069
2.243	1.61	100	2.3768	0.4546	0.4093	0.4161	0.1552	0.1402	0.1422	0.2248	0.2016	0.2052	0.2248	0.2016	0.2052
2.2505	1.77	110	2.3670	0.4625	0.4189	0.4262	0.1606	0.1485	0.1493	0.2301	0.2098	0.2119	0.2301	0.2098	0.2119
2.2453	1.93	120	2.3650	0.4673	0.4135	0.4263	0.1579	0.1426	0.1458	0.2245	0.2008	0.2061	0.2245	0.2008	0.2061
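
As a small aid to reading the table, the sketch below picks out the logged step with the lowest validation loss and the one with the highest Rouge1 Fmeasure. The rows are copied from the table above; the variable names are illustrative, and nothing here implies how the published checkpoint was actually selected.

```python
# (step, validation_loss, rouge1_fmeasure), copied from the results table.
rows = [
    (10, 2.5393, 0.3674),
    (20, 2.4752, 0.3811),
    (30, 2.4422, 0.4047),
    (40, 2.4186, 0.4081),
    (50, 2.3990, 0.4222),
    (60, 2.3857, 0.4100),
    (70, 2.3835, 0.4294),
    (80, 2.3804, 0.4236),
    (90, 2.3768, 0.4196),
    (100, 2.3768, 0.4161),
    (110, 2.3670, 0.4262),
    (120, 2.3650, 0.4263),
]

best_by_loss = min(rows, key=lambda row: row[1])
best_by_rouge1 = max(rows, key=lambda row: row[2])

print("Lowest validation loss:", best_by_loss)
print("Highest Rouge1 Fmeasure:", best_by_rouge1)
```

In these logs the two criteria disagree: validation loss is lowest at the final logged step (120), while Rouge1 Fmeasure peaks at step 70.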

Framework versions

Transformers 4.21.3
PyTorch 1.12.1+cu113
Datasets 2.6.2.dev0
Tokenizers 0.12.1
