Introduction

This is a led-large-16384 model for summarizing arXiv papers. The inputs are paper abstracts and full documents; the outputs are summaries of the papers.
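A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under the repository ID shown on this page and that the `transformers` and `torch` packages are installed. The function name `summarize`, the generation settings (`num_beams=4`, 256 new tokens), and the choice to put global attention on the first token only are illustrative assumptions, not settings documented by the model authors.

```python
# Repository ID as it appears on this page.
MODEL_ID = "AlgorithmicResearchGroup/led_large_16384_arxiv_summarization"


def summarize(text, max_input_tokens=16384, max_summary_tokens=256):
    """Summarize `text` with the LED checkpoint (hypothetical helper)."""
    # Imported lazily so the module can be loaded without the heavy dependencies.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate to the 16K-token window the model supports.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )

    # LED uses local attention everywhere except tokens flagged as global.
    # Assumption: flag only the first token, a common choice for summarization.
    global_attention_mask = torch.zeros_like(inputs["input_ids"])
    global_attention_mask[:, 0] = 1

    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        global_attention_mask=global_attention_mask,
        max_new_tokens=max_summary_tokens,  # assumed value
        num_beams=4,  # assumed value
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Calling `summarize(paper_text)` downloads the weights on first use and returns the generated summary as a string.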

AllenAI's Longformer Encoder-Decoder (LED)

As described in "Longformer: The Long-Document Transformer" by Iz Beltagy, Matthew E. Peters, and Arman Cohan, led-base-16384 was initialized from bart-base because the two models share the same architecture. To process 16K tokens, bart-base's position embedding matrix was copied 16 times.
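The copying step can be illustrated with a toy sketch. It assumes the base matrix is stacked end to end, so that position p in the extended matrix reuses the embedding of position p modulo the base length. The helper `extend_position_embeddings` and the 4 x 2 toy matrix are hypothetical; a real checkpoint would use tensors, and the exact procedure used by the LED authors may differ in detail.

```python
def extend_position_embeddings(base_matrix, copies=16):
    """Stack `copies` copies of a position-embedding matrix end to end.

    `base_matrix` is a list of rows, one embedding vector per position.
    """
    return [list(row) for _ in range(copies) for row in base_matrix]


# Toy stand-in for a learned position-embedding matrix: 4 positions, 2 dims.
base = [[float(p), float(p) * 0.5] for p in range(4)]
extended = extend_position_embeddings(base, copies=16)

# The extended matrix covers 16x as many positions as the base matrix.
print(len(base), len(extended))

# Position 5 reuses the embedding of base position 5 % 4 == 1.
print(extended[5] == base[1])
```

With a 1,024-position base matrix, the same tiling yields 1,024 x 16 = 16,384 positions, which matches the 16K-token window.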

Safetensors
Model size: 460M params
Tensor type: F32

Spaces using AlgorithmicResearchGroup/led_large_16384_arxiv_summarization: 4

Evaluation results

All results are on the ccdv/arxiv-summarization test set (verified).

ROUGE-1: 37.947
ROUGE-2: 11.314
ROUGE-L: 20.556
ROUGE-LSUM: 33.834
loss: 2.806
gen_len: 157.417
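For readers unfamiliar with the metric, ROUGE-N measures n-gram overlap between a generated summary and a reference summary. The sketch below is a simplified illustration, assuming lowercased whitespace tokenization with no stemming; it returns an F1 score on a 0 to 1 scale, whereas the scores listed above appear to be on a 0 to 100 scale. The function names are hypothetical, and this is not the implementation used to produce the reported results.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 between a reference and a candidate summary."""
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
rouge_1 = rouge_n_f1(reference, candidate, n=1)  # 5 of 6 unigrams match
rouge_2 = rouge_n_f1(reference, candidate, n=2)  # 3 of 5 bigrams match
```

ROUGE-L, by contrast, is based on the longest common subsequence rather than fixed-length n-grams, so it is not covered by this sketch.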
