BART (large-sized model), fine-tuned on Amazon Reviews (English Language)
The BART model was first trained on the CNN-DailyMail dataset and then further fine-tuned on English-language reviews of purchases made on Amazon's website. The goal was to build a pipeline that summarizes user reviews on Amazon.com.
Model description
According to Hugging Face, BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.
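To make the two pre-training steps concrete, here is a toy sketch of how a (corrupted input, original target) training pair could be built. It is an illustration only and not BART's actual preprocessing: it uses simple whitespace tokens and random token masking as the noising function, and the names `mask_tokens`, `make_denoising_pair`, and the `mask_prob` and `seed` parameters are hypothetical.

```python
import random

MASK = "<mask>"  # placeholder symbol used for corrupted positions in this sketch


def mask_tokens(tokens, mask_prob=0.3, seed=0):
    """Replace each token with MASK independently with probability mask_prob."""
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


def make_denoising_pair(text, mask_prob=0.3, seed=0):
    """Return (corrupted_input, target).

    Step (1): corrupt the text with a noising function (here, token masking).
    Step (2): the model would be trained to reconstruct `target` from the
    corrupted input; the training loop itself is not shown.
    """
    tokens = text.split()  # whitespace split is an assumption; BART uses subword tokens
    corrupted = mask_tokens(tokens, mask_prob, seed)
    return " ".join(corrupted), text


source, target = make_denoising_pair("I really like this book .")
print("encoder input :", source)
print("decoder target:", target)
```

The encoder reads the corrupted sequence bidirectionally, while the decoder generates the original text token by token, which is why the same architecture transfers well to summarization.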
Datasets
Link: Amazon Reviews Corpus
Intended uses & limitations
This model is intended to be used for summarizing user reviews on websites.
How to use
Here is how to use this model with the pipeline API:
from transformers import pipeline
summarizer = pipeline("summarization", model="mabrouk/reddit-summarizer-bart")
review = """ I really like this book. It takes a step-by-step approach to introduce the reader to the IBM Q Experience, to the basics underlying quantum computing, and to the reality of the noise involved in the current machines. This introduction is technical and shows the user how to use the IBM system either directly through the GUI on their website or by running Python code on one's own machine. The text provides examples of small exercises to try and stimulates ideas of new things to try. The IBM Q Exp Qiskit software modules are identified and introduced - Terra, Aer, Ignis, and Aqua, as well as the backends that one can choose to do the computing. The book ends with two great chapters on quantum algorithms.
"""
print(summarizer(review))
>>> [{'summary_text': 'I really like this book. It takes a step-by-step approach to introduce the reader to the IBM Q Experience, to the basics underlying quantum computing, and to the reality of the noise involved in the current machines. The book ends with two great ...'}]
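When summarizing many reviews at once, a small wrapper can help. The sketch below assumes the summarizer is a transformers summarization pipeline, which accepts a list of strings and returns one dict per input with a `summary_text` key in current releases. The helper name `summarize_reviews` and the default length limits are hypothetical choices, not values recommended by the model author.

```python
def summarize_reviews(summarizer, reviews, max_length=60, min_length=10):
    """Summarize a batch of review strings and return plain summary strings.

    `summarizer` is any callable that behaves like a transformers
    summarization pipeline (list of texts in, list of dicts out).
    """
    # Collapse runs of whitespace and drop empty reviews before calling the model.
    cleaned = [" ".join(r.split()) for r in reviews if r.strip()]
    if not cleaned:
        return []
    outputs = summarizer(
        cleaned,
        max_length=max_length,  # upper bound on generated tokens (assumed value)
        min_length=min_length,  # lower bound on generated tokens (assumed value)
        do_sample=False,        # deterministic decoding
        truncation=True,        # truncate reviews longer than the model's input limit
    )
    return [out["summary_text"] for out in outputs]
```

With the `summarizer` object created above, `summarize_reviews(summarizer, [review])` would return a list containing one summary string.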