Bart-Large Summarization Model

This repository contains the Bart-Large-paper2slides-summarizer model, which has been fine-tuned on the Automatic Slide Generation from Scientific Papers dataset with an unsupervised learning algorithm from the paper 'Unsupervised Machine Translation Using Monolingual Corpora Only'. Its primary focus is to summarize scientific texts precisely and accurately. The model was trained in parallel with Bart-large-paper2slides-expander from the same contributor.

Model Details

Bart (Bidirectional and Auto-Regressive Transformers) is a sequence-to-sequence (seq2seq) model developed by Facebook AI Research. It has shown exceptional performance in various natural language processing (NLP) tasks such as text summarization, text generation, and machine translation.

This particular model is based on Bart-Large, the larger version of the Bart model. It has 12 encoder layers and 12 decoder layers, for a total of about 406 million parameters.
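
As a rough check on the parameter count, the total can be derived by hand from the architecture. The sketch below assumes the standard configuration values of the public facebook/bart-large checkpoint (hidden size 1024, feed-forward size 4096, vocabulary size 50265, 1024 learned positions plus an offset of 2, tied input and output embeddings). These values are not stated in this card, so treat them as assumptions.

```python
# Back-of-the-envelope parameter count for a BART-large-style model.
# All sizes below are ASSUMED from the public facebook/bart-large config,
# not taken from this model card.
D_MODEL = 1024
FFN_DIM = 4096
VOCAB_SIZE = 50265
NUM_POSITIONS = 1024 + 2  # learned positions, stored with an offset of 2
ENCODER_LAYERS = 12
DECODER_LAYERS = 12


def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(dim):
    """Scale and shift vectors of a LayerNorm."""
    return 2 * dim


def attention(dim):
    """Query, key, value, and output projections."""
    return 4 * linear(dim, dim)


def feed_forward(dim, hidden):
    """Two dense layers: dim -> hidden -> dim."""
    return linear(dim, hidden) + linear(hidden, dim)


def encoder_layer():
    """Self-attention and feed-forward, each followed by a LayerNorm."""
    return (attention(D_MODEL) + layer_norm(D_MODEL)
            + feed_forward(D_MODEL, FFN_DIM) + layer_norm(D_MODEL))


def decoder_layer():
    """An encoder-style layer plus cross-attention and its LayerNorm."""
    return encoder_layer() + attention(D_MODEL) + layer_norm(D_MODEL)


def count_parameters():
    shared_embedding = VOCAB_SIZE * D_MODEL  # tied with the output projection
    # Each stack has its own position embeddings and embedding LayerNorm
    per_stack_extras = NUM_POSITIONS * D_MODEL + layer_norm(D_MODEL)
    return (shared_embedding
            + 2 * per_stack_extras
            + ENCODER_LAYERS * encoder_layer()
            + DECODER_LAYERS * decoder_layer())


total = count_parameters()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the decoder accounts for more parameters than the encoder, because each decoder layer carries an extra cross-attention block.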

Usage

To use this model, you can leverage the Hugging Face Transformers library. Here's an example of how to use it in Python:

from transformers import BartTokenizer, BartForConditionalGeneration, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "com3dian/Bart-large-paper2slides-summarizer"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

# Tokenize the input, truncating it to BART's 1024-token input limit
input_text = "Your input text here..."
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=1024)

# Generate the summary token ids with the model's default generation settings
output = model.generate(inputs["input_ids"], attention_mask=inputs["attention_mask"])

# Decode the generated token ids back into text
summary = tokenizer.decode(output[0], skip_special_tokens=True)
print(summary)

# Or use the pipeline API, which returns a list with one dict per input
summarizer = pipeline("summarization", model=model_name)
result = summarizer(input_text, max_length=50, min_length=30, do_sample=False, truncation=True)
print(result[0]["summary_text"])

Ensure you have the transformers library installed before running the code. You can install it using pip:

pip install transformers
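
Scientific papers are often longer than BART's 1024-token input window, so a full paper usually has to be split before it is summarized. The sketch below shows one simple way to do that; it is not the method used by the model's authors. It splits on whitespace words as a rough proxy for tokens and summarizes each chunk independently. The helper names chunk_words and summarize_long_text and the 400-word chunk size are hypothetical, and summarize_fn is any callable that maps a string to a summary string.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int = 400) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long_text(
    text: str,
    summarize_fn: Callable[[str], str],
    max_words: int = 400,
) -> str:
    """Summarize each chunk separately and join the partial summaries."""
    return " ".join(summarize_fn(chunk) for chunk in chunk_words(text, max_words))
```

With the summarization pipeline from the Usage example, summarize_fn could be lambda t: summarizer(t, truncation=True)[0]["summary_text"]. Because chunks are summarized independently, context that spans a chunk boundary may be lost.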

Model Fine-tuning Details

The fine-tuning process for this model involved training on the slide generation dataset using unsupervised learning techniques. Unsupervised learning refers to training a model without explicit human-labeled targets. Instead, the expansion model generates an expanded version of each original text, and this model learns to summarize that expanded version back into the original text.
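
The round trip described above can be sketched as a data-construction step. This is a simplified illustration in the spirit of back-translation, not the authors' training code: expand_fn stands in for the expansion model, and the function name, the dictionary keys, and the toy expander are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def build_back_summarization_pairs(
    original_texts: Iterable[str],
    expand_fn: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Create (input, target) pairs for the summarizer without human labels.

    The expander produces a synthetic longer text from each original text.
    The summarizer is then trained to map that synthetic text back to the
    original, so the original text serves as its own target.
    """
    pairs = []
    for text in original_texts:
        expanded = expand_fn(text)  # synthetic input from the expansion model
        pairs.append({"input": expanded, "target": text})
    return pairs


# Toy stand-in for the expansion model, for illustration only
def toy_expand(text: str) -> str:
    return f"In more detail: {text} This point is elaborated further."


pairs = build_back_summarization_pairs(["BART is a seq2seq model."], toy_expand)
print(pairs[0]["input"])
print(pairs[0]["target"])
```

Since the two models are described as trained in parallel, the expander could plausibly be trained in the mirrored way, with the summarizer's outputs as its synthetic inputs, although this card does not spell that out.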

The specific hyperparameters and training details used for fine-tuning this model are as follows:

  • Batch Size: 4
  • Learning Rate: 2e-6
  • Training Steps: 3*7
  • Optimizer: AdamW

Model Performance

The Bart-Large-paper2slides-summarizer model has been evaluated by human raters across a range of scientific domains, including AI, mathematics, statistics, history, geography, and climate science, and compared with the Bart-large-cnn model.

Acknowledgments

We would like to acknowledge the authors of the Bart model and the creators of the slide generation dataset for their valuable contributions, which have enabled the development of this fine-tuned model.

If you use this model or find it helpful in your work, please consider citing the original Bart model, the slide generation dataset, and this paper to provide proper credit to the respective authors.

License

This model and the associated code are released under the MIT license.
