facebook/blenderbot-3B

@@ -1,4 +1,3 @@
 ---
 language:
 - en
@@ -14,97 +13,14 @@ metrics:
 - perplexity
 ---
-# Blenderbot-3B
 ## Model description
-+ [Paper](https://arxiv.org/abs/1907.06616).
-+ [Original PARLAI Code]
-The abbreviation FSMT stands for FairSeqMachineTranslation
-All four models are available:
-* [wmt19-en-ru](https://huggingface.co/facebook/wmt19-en-ru)
-* [wmt19-ru-en](https://huggingface.co/facebook/wmt19-ru-en)
-* [wmt19-en-de](https://huggingface.co/facebook/wmt19-en-de)
-* [wmt19-de-en](https://huggingface.co/facebook/wmt19-de-en)
-## Intended uses & limitations
-#### How to use
-```python
-from transformers import FSMTForConditionalGeneration, FSMTTokenizer
-mname = "facebook/wmt19-en-ru"
-tokenizer = FSMTTokenizer.from_pretrained(mname)
-model = FSMTForConditionalGeneration.from_pretrained(mname)
-# Named `text` to avoid shadowing the built-in `input`.
-text = "Machine learning is great, isn't it?"
-input_ids = tokenizer.encode(text, return_tensors="pt")
-outputs = model.generate(input_ids)
-decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
-print(decoded) # Машинное обучение - это здорово, не так ли?
-```
-#### Limitations and bias
-- The original model and this port do not seem to handle inputs with repeated sub-phrases well: [content gets truncated](https://discuss.huggingface.co/t/issues-with-translating-inputs-containing-repeated-phrases/981)
-## Training data
-Pretrained weights were left identical to the original model released by fairseq. For more details, please, see the [paper](https://arxiv.org/abs/1907.06616).
-## Eval results
-pair   | fairseq | transformers
--------|---------|----------
-en-ru  | [36.4](http://matrix.statmt.org/matrix/output/1914?run_id=6724) | 33.47
-The score is slightly below the score reported by `fairseq`, since `transformers` currently doesn't support:
-- model ensemble, therefore the best performing checkpoint was ported (`model4.pt`).
-- re-ranking
-The score was calculated using this code:
-```bash
-git clone https://github.com/huggingface/transformers
-cd transformers
-export PAIR=en-ru
-export DATA_DIR=data/$PAIR
-export SAVE_DIR=data/$PAIR
-export BS=8
-export NUM_BEAMS=15
-mkdir -p $DATA_DIR
-sacrebleu -t wmt19 -l $PAIR --echo src > $DATA_DIR/val.source
-sacrebleu -t wmt19 -l $PAIR --echo ref > $DATA_DIR/val.target
-echo $PAIR
-PYTHONPATH="src:examples/seq2seq" python examples/seq2seq/run_eval.py facebook/wmt19-$PAIR $DATA_DIR/val.source $SAVE_DIR/test_translations.txt --reference_path $DATA_DIR/val.target --score_path $SAVE_DIR/test_bleu.json --bs $BS --task translation --num_beams $NUM_BEAMS
-```
-Note: fairseq reports using a beam of 50, so you should get a slightly higher score if you re-run with `--num_beams 50`.
-## Data Sources
-- [training, etc.](http://www.statmt.org/wmt19/)
-- [test set](http://matrix.statmt.org/test_sets/newstest2019.tgz?1556572561)
-### BibTeX entry and citation info
-```bibtex
-@inproceedings{...,
-  year={2020},
-  title={Facebook FAIR's WMT19 News Translation Task Submission},
-  author={Ng, Nathan and Yee, Kyra and Baevski, Alexei and Ott, Myle and Auli, Michael and Edunov, Sergey},
-  booktitle={Proc. of WMT},
-}
-```
-## TODO
-- port model ensemble (fairseq uses 4 model checkpoints)
++ Paper: [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637)
++ [Original ParlAI Code](https://parl.ai/projects/recipes/)
+### Abstract
+Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a seamless way: providing engaging talking points and listening to their partners, both asking and answering questions, and displaying knowledge, empathy and personality appropriately, depending on the situation. We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy. We build variants of these recipes with 90M, 2.7B and 9.4B parameter neural models, and make our models and code publicly available. Human evaluations show our best models are superior to existing approaches in multi-turn dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing failure cases of our models.
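
The updated card describes the model but does not show how to query it. A minimal sketch of a single-turn exchange follows, assuming the checkpoint loads through the `BlenderbotTokenizer` and `BlenderbotForConditionalGeneration` classes in `transformers`, and that PyTorch is installed. The helper name `chat_once` is hypothetical, and default generation settings are used rather than the generation strategy studied in the paper.

```python
def chat_once(utterance: str, model_name: str = "facebook/blenderbot-3B") -> str:
    """Return one model reply to a single user utterance.

    `chat_once` is an illustrative helper, not part of any library.
    The first call downloads several GB of weights.
    """
    # Imported lazily so that defining the helper does not require
    # transformers or torch to be importable.
    from transformers import BlenderbotForConditionalGeneration, BlenderbotTokenizer

    tokenizer = BlenderbotTokenizer.from_pretrained(model_name)
    model = BlenderbotForConditionalGeneration.from_pretrained(model_name)

    # Tokenize a batch containing the one utterance and return PyTorch tensors.
    inputs = tokenizer([utterance], return_tensors="pt")

    # Generate with the checkpoint's default generation configuration.
    reply_ids = model.generate(**inputs)

    # Decode the first (and only) sequence in the batch.
    return tokenizer.batch_decode(reply_ids, skip_special_tokens=True)[0]
```

Calling `chat_once("My friends are cool but they eat too many carbs.")` should return a short conversational reply. The exact text depends on the library version and generation defaults, so no expected output is given here. Loading the tokenizer and model on every call keeps the sketch short; a real application would load them once and reuse them.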