Longformer Encoder-Decoder (LED) fine-tuned on Billsum
This model is a fine-tuned version of led-base-16384 on the billsum dataset.
As described in Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, and Arman Cohan, led-base-16384 was initialized from bart-base, since both models share the exact same architecture. To process 16K tokens, bart-base's position embedding matrix was simply copied 16 times.
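The position-embedding copy described above can be illustrated with a toy example. This is a minimal sketch in plain Python, not the conversion script used by the LED authors: `tile_position_embeddings` is a hypothetical helper, and the tiny 4-row table stands in for bart-base's learned position embeddings (1,024 positions, so 16 copies give 16,384).

```python
def tile_position_embeddings(base, copies):
    """Repeat a position-embedding table `copies` times along the position axis.

    `base` is a list of rows, one embedding vector per position.
    (Hypothetical helper, for illustration only.)
    """
    return [list(row) for _ in range(copies) for row in base]


# Toy stand-in: 4 positions with 2-dimensional embeddings.
base = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
extended = tile_position_embeddings(base, copies=16)

print(len(extended))           # number of positions after tiling: 4 * 16
print(extended[4] == base[0])  # position 4 reuses the embedding of position 0

# With 1,024 base positions, 16 copies cover LED's 16,384-token input length.
print(1024 * 16)
```

In a real model the same operation would be carried out on the embedding tensor rather than on Python lists.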
How to use
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Run on the GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("d0r1h/LEDBill")
model = AutoModelForSeq2SeqLM.from_pretrained("d0r1h/LEDBill", return_dict_in_generate=True).to(device)

case = "......."  # text of the bill to summarize
input_ids = tokenizer(case, return_tensors="pt").input_ids.to(device)

# LED attends locally by default; give the first token global attention
global_attention_mask = torch.zeros_like(input_ids)
global_attention_mask[:, 0] = 1

sequences = model.generate(input_ids, global_attention_mask=global_attention_mask).sequences
summary = tokenizer.batch_decode(sequences, skip_special_tokens=True)
Evaluation results
When used to summarize Billsum documents (10 samples), the model achieves the following results:
Model | rouge1-f | rouge1-p | rouge2-f | rouge2-p | rougeL-f | rougeL-p |
---|---|---|---|---|---|---|
LEDBill | 34 | 37 | 15 | 16 | 30 | 32 |
led-base | 2 | 15 | 0 | 0 | 2 | 15 |
This notebook shows how LED can be used effectively for downstream tasks such as summarization.
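For readers unfamiliar with the `-p` and `-f` columns in the table above, the sketch below shows how ROUGE-N precision and F1 are defined from n-gram overlap. It is an illustration under simplifying assumptions (lowercasing and whitespace tokenization, no stemming), not the scoring code behind the reported numbers; `ngrams` and `rouge_n` are hypothetical helper names, and the two sentences are made-up examples. Scores here are fractions between 0 and 1, whereas the table appears to use a 0 to 100 scale.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N precision, recall and F1 with whitespace tokenization."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"p": precision, "r": recall, "f": f1}


# Made-up example sentences (not taken from Billsum).
reference = "the bill amends the tax code"
candidate = "the bill changes the tax code"

rouge1 = rouge_n(candidate, reference, n=1)  # 5 of 6 unigrams match
rouge2 = rouge_n(candidate, reference, n=2)  # 3 of 5 bigrams match
print(rouge1)
print(rouge2)
```

Precision divides the overlap by the number of n-grams in the generated summary, recall divides it by the number in the reference, and F1 is their harmonic mean.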
Evaluation results
- ROUGE-1 on billsum test set (self-reported): 38.650
- ROUGE-2 on billsum test set (self-reported): 18.546
- ROUGE-L on billsum test set (self-reported): 25.656
- ROUGE-LSUM on billsum test set (self-reported): 33.157
- loss on billsum test set (self-reported): 2.131
- gen_len on billsum test set (self-reported): 288.372
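ROUGE-L, listed above, scores the longest common subsequence (LCS) between a generated summary and its reference, so it rewards matching words only when they appear in the same order. The following is a minimal sketch under simplifying assumptions (lowercasing, whitespace tokenization), not the evaluation code that produced the self-reported numbers; `lcs_length` and `rouge_l` are hypothetical helper names and the sentences are made up. ROUGE-LSUM is a summary-level variant that is not covered here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            # Extend the diagonal on a match, otherwise carry the best neighbor.
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L precision, recall and F1 with whitespace tokenization."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    precision = lcs / max(len(cand), 1)
    recall = lcs / max(len(ref), 1)
    f1 = 0.0 if lcs == 0 else 2 * precision * recall / (precision + recall)
    return {"p": precision, "r": recall, "f": f1}


# Made-up example: every candidate word occurs in the reference,
# but only "the bill amends" appears in the same order.
reference = "the bill amends the tax code"
candidate = "tax code the bill amends"

scores = rouge_l(candidate, reference)
print(scores)
```

In this example all five candidate words occur in the reference, yet the LCS has length 3, which is why ROUGE-L is typically lower than ROUGE-1 for the same pair of texts.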