---
license: apache-2.0
base_model: t5-small
tags:
- generated_from_trainer
datasets:
- xsum
metrics:
- rouge
model-index:
- name: fastSUMMARIZER-t5-small-finetuned-on-xsum
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: xsum
      type: xsum
      config: default
      split: validation
      args: default
    metrics:
    - name: Rouge1
      type: rouge
      value: 31.3222
pipeline_tag: summarization
widget:
- text: >-
    There will soon be flying taxis. Many of us grew up watching science
    fiction movies with these. The Japanese airline ANA and a U.S. tech
    start-up called Joby Aviation will fly air taxis at the 2025 World Expo in
    Osaka. They are currently building the taxis. They will need to follow air
    traffic rules. They will also need to train flying taxi pilots. The
    five-seat, all-electric taxi will take off and land vertically. It will
    fly as far as 241 kilometers and have a top speed of 321kph. Joby said the
    taxis are environmentally friendly. People can reduce their carbon
    footprint. It said Japan was a great place to test the taxis because 92
    per cent of the population live in towns and cities. The president of ANA
    said the airline has 70 years of safe and reliable flights. He said it was
    good that customers have 'the option to travel rapidly, and sustainably,
    from an international airport to a downtown location'.
- text: >-
    Everybody knows that eating carrots is good for our eyesight. A new study
    suggests that grapes are also good for our eyes. Researchers from the
    National University of Singapore have found that eating just a few grapes
    a day can improve our vision. This is especially so for people who are
    older. Dr Eun Kim, the lead researcher, said: 'Our study is the first to
    show that grape consumption beneficially impacts eye health in humans,
    which is very exciting, especially with a growing, ageing population.' Dr
    Kim added that, 'grapes are an easily accessible fruit that studies have
    shown can have a beneficial impact' on our eyesight. This is good news for
    people who don't really like carrots. The study is published in the
    journal 'Food & Function'. Thirty-four adults took part in a series of
    experiments over 16 weeks. Half of the participants ate one-and-a-half
    cups of grapes per day; the other half ate a placebo snack. Dr Kim did not
    tell the participants or the researchers whether she was testing the
    grapes or the snack. She thought that not revealing this information would
    give better test results. She found that people who ate the grapes had
    improved muscle strength around the retina. The retina passes information
    about light to the brain via electrical signals. It protects the eyes from
    damaging blue light. A lot of blue light comes from computer and
    smartphone screens, and from LED lights.
---

# t5-small-finetuned-summarization-xsum
This model is a fine-tuned version of t5-small on the xsum dataset. It is fast and lightweight: it typically summarizes an entire text in under a second, which makes it well suited to low-resource environments.
Model demo: https://huggingface.co/spaces/Rahmat82/RHM-text-summarizer-light
It achieves the following results on the evaluation set:
- Loss: 2.2425
- Rouge1: 31.3222
- Rouge2: 10.0614
- RougeL: 25.0513
- RougeLsum: 25.0446
- Gen Len: 18.8044
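The Rouge1 score above measures unigram overlap between generated and reference summaries. As a minimal illustrative sketch only (the reported scores come from the standard `rouge` metric implementation, not this toy function), ROUGE-1 F1 can be computed like this:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Toy ROUGE-1 F1: unigram overlap between prediction and reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Count unigrams appearing in both, respecting multiplicity
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 4))  # → 0.8333
```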
## Model description
This model is lightweight and fast: whether it runs on GPU or CPU, it typically summarizes a text in under a second, and inference can be made faster still by exporting it with optimum (see below).
## Use the model

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

model_id = "Rahmat82/t5-small-finetuned-summarization-xsum"
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)

text_to_summarize = """
The koala is regarded as the epitome of cuddliness. However, animal lovers
will be saddened to hear that this lovable marsupial has been moved to the
endangered species list. The Australian Koala Foundation estimates there are
somewhere between 43,000-100,000 koalas left in the wild. Their numbers have
been dwindling rapidly due to disease, loss of habitat, bushfires, being hit
by cars, and other threats. Stuart Blanch from the World Wildlife Fund in
Australia said: "Koalas have gone from no listing to vulnerable to endangered
within a decade. That is a shockingly fast decline." He added that koalas risk
"sliding toward extinction"
"""
print(summarizer(text_to_summarize)[0]["summary_text"])
```
## Use the model with optimum/onnxruntime (faster)

```python
#!pip install -q transformers accelerate optimum onnxruntime onnx
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSeq2SeqLM
from optimum.pipelines import pipeline

model_name = "Rahmat82/t5-small-finetuned-summarization-xsum"

# Export the model to ONNX and run it with ONNX Runtime
model = ORTModelForSeq2SeqLM.from_pretrained(model_name, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
summarizer = pipeline(
    "summarization",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",
    batch_size=12,
)

text_to_summarize = """
The koala is regarded as the epitome of cuddliness. However, animal lovers
will be saddened to hear that this lovable marsupial has been moved to the
endangered species list. The Australian Koala Foundation estimates there are
somewhere between 43,000-100,000 koalas left in the wild. Their numbers have
been dwindling rapidly due to disease, loss of habitat, bushfires, being hit
by cars, and other threats. Stuart Blanch from the World Wildlife Fund in
Australia said: "Koalas have gone from no listing to vulnerable to endangered
within a decade. That is a shockingly fast decline." He added that koalas risk
"sliding toward extinction"
"""
print(summarizer(text_to_summarize)[0]["summary_text"])
```
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 28
- eval_batch_size: 28
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
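The hyperparameters above map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a sketch, not the exact training script; `output_dir` and `predict_with_generate` are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-summarization-xsum",
    learning_rate=2e-4,
    per_device_train_batch_size=28,
    per_device_eval_batch_size=28,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                    # Native AMP mixed precision
    predict_with_generate=True,   # assumption: needed to compute ROUGE during eval
)
```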
## Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.5078        | 1.0   | 7288  | 2.2860          | 30.9087 | 9.7673  | 24.6951 | 24.6927   | 18.7973 |
| 2.4245        | 2.0   | 14576 | 2.2425          | 31.3222 | 10.0614 | 25.0513 | 25.0446   | 18.8044 |
## Framework versions
- Transformers 4.37.0
- Pytorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.15.1