--- datasets: - bigscience/xP3mt - mc4 license: apache-2.0 language: - af - am - ar - az - be - bg - bn - ca - ceb - co - cs - cy - da - de - el - en - eo - es - et - eu - fa - fi - fil - fr - fy - ga - gd - gl - gu - ha - haw - hi - hmn - ht - hu - hy - ig - is - it - iw - ja - jv - ka - kk - km - kn - ko - ku - ky - la - lb - lo - lt - lv - mg - mi - mk - ml - mn - mr - ms - mt - my - ne - nl - 'no' - ny - pa - pl - ps - pt - ro - ru - sd - si - sk - sl - sm - sn - so - sq - sr - st - su - sv - sw - ta - te - tg - th - tr - uk - und - ur - uz - vi - xh - yi - yo - zh - zu tags: - text2text-generation widget: - text: Life is beautiful! Translate to Mongolian. example_title: mn-en translation - text: Le mot japonais «憂鬱» veut dire quoi en Odia? example_title: jp-or-fr translation - text: >- Stell mir eine schwierige Quiz Frage bei der es um Astronomie geht. Bitte stell die Frage auf Norwegisch. example_title: de-nb quiz - text: >- 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。Would you rate the previous review as positive, neutral or negative? example_title: zh-en sentiment - text: 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评? example_title: zh-zh sentiment - text: Suggest at least five related search terms to "Mạng neural nhân tạo". example_title: vi-en query - text: >- Proposez au moins cinq mots clés concernant «Réseau de neurones artificiels». example_title: fr-fr query - text: Explain in a sentence in Telugu what is backpropagation in neural networks. example_title: te-en qa - text: Why is the sky blue? example_title: en-en qa - text: >- Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is "Heroes Come in All Shapes and Sizes". Story (in Spanish): example_title: es-en fable - text: >- Write a fable about wood elves living in a forest that is suddenly invaded by ogres. The fable is a masterpiece that has achieved praise worldwide and its moral is "Violence is the last refuge of the incompetent". Fable (in Hindi): example_title: hi-en fable model-index: - name: mt0-xxl-mt results: - task: type: Coreference resolution dataset: type: winogrande name: Winogrande XL (xl) config: xl split: validation revision: a80f460359d1e9a67c006011c94de42a8759430c metrics: - type: Accuracy value: 62.67 - task: type: Coreference resolution dataset: type: Muennighoff/xwinograd name: XWinograd (en) config: en split: test revision: 9dd5ea5505fad86b7bedad667955577815300cee metrics: - type: Accuracy value: 83.31 - task: type: Coreference resolution dataset: type: Muennighoff/xwinograd name: XWinograd (fr) config: fr split: test revision: 9dd5ea5505fad86b7bedad667955577815300cee metrics: - type: Accuracy value: 78.31 - task: type: Coreference resolution dataset: type: Muennighoff/xwinograd name: XWinograd (jp) config: jp split: test revision: 9dd5ea5505fad86b7bedad667955577815300cee metrics: - type: Accuracy value: 80.19 - task: type: Coreference resolution dataset: type: Muennighoff/xwinograd name: XWinograd (pt) config: pt split: test revision: 9dd5ea5505fad86b7bedad667955577815300cee metrics: - type: Accuracy value: 80.99 - task: type: Coreference resolution dataset: type: Muennighoff/xwinograd name: XWinograd (ru) config: ru split: test revision: 9dd5ea5505fad86b7bedad667955577815300cee metrics: - type: Accuracy value: 79.05 - task: type: Coreference resolution dataset: type: Muennighoff/xwinograd name: XWinograd (zh) config: zh split: test revision: 9dd5ea5505fad86b7bedad667955577815300cee metrics: - type: Accuracy value: 82.34 - task: type: Natural language inference dataset: type: anli name: ANLI (r1) config: r1 split: validation revision: 9dbd830a06fea8b1c49d6e5ef2004a08d9f45094 metrics: - type: Accuracy value: 49.5 - task: type: Natural language inference dataset: type: anli name: ANLI (r2) config: r2 split: validation revision: 9dbd830a06fea8b1c49d6e5ef2004a08d9f45094 metrics: - type: Accuracy value: 42 - task: type: Natural language inference dataset: type: anli name: ANLI (r3) config: r3 split: validation revision: 9dbd830a06fea8b1c49d6e5ef2004a08d9f45094 metrics: - type: Accuracy value: 48.17 - task: type: Natural language inference dataset: type: super_glue name: SuperGLUE (cb) config: cb split: validation revision: 9e12063561e7e6c79099feb6d5a493142584e9e2 metrics: - type: Accuracy value: 87.5 - task: type: Natural language inference dataset: type: super_glue name: SuperGLUE (rte) config: rte split: validation revision: 9e12063561e7e6c79099feb6d5a493142584e9e2 metrics: - type: Accuracy value: 84.84 - task: type: Natural language inference dataset: type: xnli name: XNLI (ar) config: ar split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 58.03 - task: type: Natural language inference dataset: type: xnli name: XNLI (bg) config: bg split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 59.92 - task: type: Natural language inference dataset: type: xnli name: XNLI (de) config: de split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 60.16 - task: type: Natural language inference dataset: type: xnli name: XNLI (el) config: el split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 59.2 - task: type: Natural language inference dataset: type: xnli name: XNLI (en) config: en split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 62.25 - task: type: Natural language inference dataset: type: xnli name: XNLI (es) config: es split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 60.92 - task: type: Natural language inference dataset: type: xnli name: XNLI (fr) config: fr split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 59.88 - task: type: Natural language inference dataset: type: xnli name: XNLI (hi) config: hi split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 57.47 - task: type: Natural language inference dataset: type: xnli name: XNLI (ru) config: ru split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 58.67 - task: type: Natural language inference dataset: type: xnli name: XNLI (sw) config: sw split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 56.79 - task: type: Natural language inference dataset: type: xnli name: XNLI (th) config: th split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 58.03 - task: type: Natural language inference dataset: type: xnli name: XNLI (tr) config: tr split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 57.67 - task: type: Natural language inference dataset: type: xnli name: XNLI (ur) config: ur split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 55.98 - task: type: Natural language inference dataset: type: xnli name: XNLI (vi) config: vi split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 58.92 - task: type: Natural language inference dataset: type: xnli name: XNLI (zh) config: zh split: validation revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 metrics: - type: Accuracy value: 58.71 - task: type: Sentence completion dataset: type: story_cloze name: StoryCloze (2016) config: '2016' split: validation revision: e724c6f8cdf7c7a2fb229d862226e15b023ee4db metrics: - type: Accuracy value: 94.66 - task: type: Sentence completion dataset: type: super_glue name: SuperGLUE (copa) config: copa split: validation revision: 9e12063561e7e6c79099feb6d5a493142584e9e2 metrics: - type: Accuracy value: 88 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (et) config: et split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 81 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (ht) config: ht split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 79 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (id) config: id split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 90 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (it) config: it split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 88 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (qu) config: qu split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 56 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (sw) config: sw split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 81 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (ta) config: ta split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 81 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (th) config: th split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 76 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (tr) config: tr split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 76 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (vi) config: vi split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 85 - task: type: Sentence completion dataset: type: xcopa name: XCOPA (zh) config: zh split: validation revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187 metrics: - type: Accuracy value: 87 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (ar) config: ar split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 91 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (es) config: es split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 93.38 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (eu) config: eu split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 91.13 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (hi) config: hi split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 90.73 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (id) config: id split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 93.05 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (my) config: my split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 86.7 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (ru) config: ru split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 91.66 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (sw) config: sw split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 89.61 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (te) config: te split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 90.4 - task: type: Sentence completion dataset: type: Muennighoff/xstory_cloze name: XStoryCloze (zh) config: zh split: validation revision: 8bb76e594b68147f1a430e86829d07189622b90d metrics: - type: Accuracy value: 93.05 pipeline_tag: text2text-generation --- ![xmtf](https://github.com/bigscience-workshop/xmtf/blob/master/xmtf_banner.png?raw=true) # Table of Contents 1. [Model Summary](#model-summary) 2. [Use](#use) 3. [Limitations](#limitations) 4. [Training](#training) 5. [Evaluation](#evaluation) 7. [Citation](#citation) # Model Summary > We present BLOOMZ & mT0, a family of models capable of following human instructions in dozens of languages zero-shot. We finetune BLOOM & mT5 pretrained multilingual language models on our crosslingual task mixture (xP3) and find our resulting models capable of crosslingual generalization to unseen tasks & languages. - **Repository:** [bigscience-workshop/xmtf](https://github.com/bigscience-workshop/xmtf) - **Paper:** [Crosslingual Generalization through Multitask Finetuning](https://arxiv.org/abs/2211.01786) - **Point of Contact:** [Niklas Muennighoff](mailto:niklas@hf.co) - **Languages:** Refer to [mc4](https://huggingface.co/datasets/mc4) for pretraining & [xP3](https://huggingface.co/bigscience/xP3) for finetuning language proportions. It understands both pretraining & finetuning languages. - **BLOOMZ & mT0 Model Family:**
Multitask finetuned on xP3. Recommended for prompting in English. | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Parameters | 300M | 580M | 1.2B | 3.7B | 13B | 560M | 1.1B | 1.7B | 3B | 7.1B | 176B |
Finetuned Model | mt0-small | mt0-base | mt0-large | mt0-xl | mt0-xxl | bloomz-560m | bloomz-1b1 | bloomz-1b7 | bloomz-3b | bloomz-7b1 | bloomz |
Multitask finetuned on xP3mt. Recommended for prompting in non-English. | |||||||||||
Finetuned Model | mt0-xxl-mt | bloomz-7b1-mt | bloomz-mt | Multitask finetuned on P3. Released for research purposes only. Strictly inferior to above models! | |||||||
Finetuned Model | mt0-xxl-p3 | bloomz-7b1-p3 | bloomz-p3 | Original pretrained checkpoints. Not recommended. | |||||||
Pretrained Model | mt5-small | mt5-base | mt5-large | mt5-xl | mt5-xxl | bloom-560m | bloom-1b1 | bloom-1b7 | bloom-3b | bloom-7b1 | bloom |