---
license: apache-2.0
language: en
---

# BART (base-sized model)

BART model pre-trained on the English language. It was introduced in the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461) by Lewis et al. and first released in [this repository](https://github.com/pytorch/fairseq/tree/master/examples/bart).

Disclaimer: The team releasing BART did not write a model card for this model, so this model card was written by the Hugging Face team.

## Model description

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) training the model to reconstruct the original text.
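
As a rough illustration of step (1), the sketch below corrupts a token sequence by replacing one randomly placed span with a single `<mask>` token, in the spirit of the text-infilling noise described in the BART paper. The `infill_span` helper, the whitespace tokenization, and the fixed span length are assumptions made for brevity (the paper samples span lengths from a Poisson distribution); this is not the fairseq pre-training code.

```python
import random

MASK = "<mask>"


def infill_span(tokens, span_len, rng):
    """Replace one randomly placed span of span_len tokens with a single mask token.

    Hypothetical helper for illustration only; not part of fairseq or transformers.
    """
    if not 0 <= span_len <= len(tokens):
        raise ValueError("span_len must be between 0 and len(tokens)")
    # Pick a start index such that the whole span fits inside the sequence.
    start = rng.randrange(len(tokens) - span_len + 1)
    return tokens[:start] + [MASK] + tokens[start + span_len:]


rng = random.Random(0)  # seeded so the corruption is reproducible
original = "the quick brown fox jumps over the lazy dog".split()
corrupted = infill_span(original, span_len=3, rng=rng)

# (corrupted, original) forms one denoising training pair: the encoder reads
# the corrupted tokens and the decoder is trained to emit the original ones.
print(" ".join(corrupted))
```

Because the whole span collapses into one mask token, the model also has to infer how many tokens are missing, not just which ones.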

BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering).

## Intended uses & limitations

You can use the raw model for text infilling. However, the model is mostly meant to be fine-tuned on a supervised dataset. See the [model hub](https://huggingface.co/models?search=bart) to find versions fine-tuned on a task that interests you.
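
A minimal sketch of text infilling with the raw checkpoint might look like the following. It assumes `transformers` and `torch` are installed and that `facebook/bart-base` can be downloaded; the `top_mask_predictions` helper and the example sentence are illustrative choices, not part of the library.

```python
MODEL_NAME = "facebook/bart-base"
EXAMPLE_TEXT = "My friends are <mask> but they eat too many carbs."


def top_mask_predictions(text, k=5):
    """Return up to k candidate fillers for the single <mask> token in text.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so the helper can be defined without loading the heavy dependencies.
    import torch
    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained(MODEL_NAME)
    model = BartForConditionalGeneration.from_pretrained(MODEL_NAME)
    model.eval()

    input_ids = tokenizer([text], return_tensors="pt")["input_ids"]

    # Locate the mask token; this sketch only handles exactly one mask.
    mask_positions = (input_ids[0] == tokenizer.mask_token_id).nonzero()
    if mask_positions.numel() != 1:
        raise ValueError("expected exactly one <mask> token")
    masked_index = mask_positions.item()

    with torch.no_grad():
        logits = model(input_ids).logits

    # Rank the vocabulary by probability at the masked position.
    probs = logits[0, masked_index].softmax(dim=0)
    _, predictions = probs.topk(k)
    return tokenizer.decode(predictions).split()


if __name__ == "__main__":
    print(top_mask_predictions(EXAMPLE_TEXT))
```

The helper reads the language-modeling logits at the masked position instead of calling `generate`, which keeps the example short; for longer spans, generation with `model.generate` is likely the better fit.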

### How to use

Here is how to use this model in PyTorch to get the hidden states for a given text:

```python
from transformers import BartTokenizer, BartModel

# Load the pre-trained tokenizer and the bare encoder-decoder model (no task head)
tokenizer = BartTokenizer.from_pretrained('facebook/bart-base')
model = BartModel.from_pretrained('facebook/bart-base')

# Tokenize the input text and return PyTorch tensors
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

# Final-layer decoder hidden states: (batch_size, sequence_length, hidden_size)
last_hidden_states = outputs.last_hidden_state
```

### BibTeX entry and citation info

```bibtex
@article{DBLP:journals/corr/abs-1910-13461,
  author    = {Mike Lewis and
               Yinhan Liu and
               Naman Goyal and
               Marjan Ghazvininejad and
               Abdelrahman Mohamed and
               Omer Levy and
               Veselin Stoyanov and
               Luke Zettlemoyer},
  title     = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
               Generation, Translation, and Comprehension},
  journal   = {CoRR},
  volume    = {abs/1910.13461},
  year      = {2019},
  url       = {http://arxiv.org/abs/1910.13461},
  eprinttype = {arXiv},
  eprint    = {1910.13461},
  timestamp = {Thu, 31 Oct 2019 14:02:26 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1910-13461.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
```