---
license: bigscience-bloom-rail-1.0
datasets:
- bigscience/xP3
language:
- ak
- ar
- as
- bm
- bn
- ca
- code
- en
- es
- eu
- fon
- fr
- gu
- hi
- id
- ig
- ki
- kn
- lg
- ln
- ml
- mr
- ne
- nso
- ny
- or
- pa
- pt
- rn
- rw
- sn
- st
- sw
- ta
- te
- tn
- ts
- tum
- tw
- ur
- vi
- wo
- xh
- yo
- zh
- zu
programming_language:
- C
- C++
- C#
- Go
- Java
- JavaScript
- Lua
- PHP
- Python
- Ruby
- Rust
- Scala
- TypeScript
pipeline_tag: text-generation
widget:
- text: "一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。Would you rate the previous review as positive, neutral or negative?"
example_title: "zh-en sentiment"
- text: "一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评?"
example_title: "zh-zh sentiment"
- text: "Suggest at least five related search terms to \"Mạng neural nhân tạo\"."
example_title: "vi-en query"
- text: "Proposez au moins cinq mots clés concernant «Réseau de neurones artificiels»."
example_title: "fr-fr query"
- text: "Explain in a sentence in Telugu what is backpropagation in neural networks."
example_title: "te-en qa"
- text: "Why is the sky blue?"
example_title: "en-en qa"
- text: "Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is \"Heroes Come in All Shapes and Sizes\". Story (in Spanish):"
example_title: "es-en fable"
- text: "Write a fable about wood elves living in a forest that is suddenly invaded by ogres. The fable is a masterpiece that has achieved praise worldwide and its moral is \"Violence is the last refuge of the incompetent\". Fable (in Hindi):"
example_title: "hi-en fable"
---
**Repository**: [bigscience-workshop/bloomz](https://github.com/bigscience-workshop/bloomz)
# Models
BLOOMZ is a family of multilingual models capable of following user instructions in a variety of languages. Together with our paper [TODO: LINK], we release the following models:
----
- [bloomz](https://huggingface.co/bigscience/bloomz): 176B parameter multitask finetuned version of [bloom](https://huggingface.co/bigscience/bloom) on [xP3](https://huggingface.co/datasets/bigscience/xP3)
- [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1): 7.1B parameter multitask finetuned version of [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) on [xP3](https://huggingface.co/datasets/bigscience/xP3)
- [bloomz-3b](https://huggingface.co/bigscience/bloomz-3b): 3B parameter multitask finetuned version of [bloom-3b](https://huggingface.co/bigscience/bloom-3b) on [xP3](https://huggingface.co/datasets/bigscience/xP3)
- [bloomz-1b7](https://huggingface.co/bigscience/bloomz-1b7): 1.7B parameter multitask finetuned version of [bloom-1b7](https://huggingface.co/bigscience/bloom-1b7) on [xP3](https://huggingface.co/datasets/bigscience/xP3)
- [bloomz-1b1](https://huggingface.co/bigscience/bloomz-1b1): 1.1B parameter multitask finetuned version of [bloom-1b1](https://huggingface.co/bigscience/bloom-1b1) on [xP3](https://huggingface.co/datasets/bigscience/xP3)
- [bloomz-560m](https://huggingface.co/bigscience/bloomz-560m): 560M parameter multitask finetuned version of [bloom-560m](https://huggingface.co/bigscience/bloom-560m) on [xP3](https://huggingface.co/datasets/bigscience/xP3)
----
- [bloomz-mt](https://huggingface.co/bigscience/bloomz-mt): 176B parameter multitask finetuned version of [bloom](https://huggingface.co/bigscience/bloom) on [xP3](https://huggingface.co/datasets/bigscience/xP3) & [xP3mt](https://huggingface.co/datasets/bigscience/xP3mt). **Better than [bloomz](https://huggingface.co/bigscience/bloomz) when prompting in non-English**
- [bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt): 7.1B parameter multitask finetuned version of [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) on [xP3](https://huggingface.co/datasets/bigscience/xP3) & [xP3mt](https://huggingface.co/datasets/bigscience/xP3mt). **Better than [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1) when prompting in non-English**
----
- [bloomz-p3](https://huggingface.co/bigscience/bloomz-p3): 176B parameter multitask finetuned version of [bloom](https://huggingface.co/bigscience/bloom) on [P3](https://huggingface.co/datasets/bigscience/P3). **Released for research purposes; performance is inferior to [bloomz](https://huggingface.co/bigscience/bloomz)**
- [bloomz-7b1-p3](https://huggingface.co/bigscience/bloomz-7b1-p3): 7.1B parameter multitask finetuned version of [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) on [P3](https://huggingface.co/datasets/bigscience/P3). **Released for research purposes; performance is inferior to [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1)**
----
# Intended uses
You can use the models to perform inference on tasks by specifying your query in natural language, and the models will generate a prediction. For instance, you can ask *"Translate this to Chinese: Je t'aime."*, and the model will hopefully generate *"我爱你"*.
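For a quick check of this kind of prompting, the high-level `pipeline` API is the shortest path. A minimal sketch (the checkpoint and the `max_new_tokens` value are illustrative choices, not prescriptions):
```python
from transformers import pipeline

# Any bloomz checkpoint works here; the 560M model is the cheapest to try
generator = pipeline("text-generation", model="bigscience/bloomz-560m")

# The instruction itself specifies the task; no task-specific head is needed
result = generator("Translate this to Chinese: Je t'aime.", max_new_tokens=10)
print(result[0]["generated_text"])
```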
# How to use
Here is how to use the model in PyTorch:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the smallest checkpoint
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

# Phrase the task as a natural-language instruction and generate greedily
inputs = tokenizer.encode("Is this review positive or negative? Review: this is the best cast iron skillet you will ever buy", return_tensors="pt")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```
To use another checkpoint, replace the model name passed to `AutoTokenizer` and `AutoModelForCausalLM`.
**Note: the 176B models were trained in bfloat16, while the smaller models were trained in fp16. We recommend running inference in the same precision or in fp32.**
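For GPU inference, loading the weights directly in the training precision saves memory. A minimal sketch (the checkpoint choice is illustrative; `torch_dtype` and `device_map` are standard `from_pretrained` arguments, and `device_map="auto"` assumes the `accelerate` package is installed):
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "bigscience/bloomz-7b1"  # an fp16-trained checkpoint; use torch.bfloat16 for the 176B models
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Load the weights in half precision and let accelerate place them on available devices
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float16,  # or torch.float32 for full-precision inference
    device_map="auto",
)

inputs = tokenizer.encode("Why is the sky blue?", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```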
# Limitations
- Large model sizes may require significant computational resources; one possible mitigation, 8-bit quantized inference, is sketched below
- Performance may vary significantly depending on the prompt
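A minimal sketch of 8-bit quantized inference, assuming a CUDA GPU and that the optional `bitsandbytes` and `accelerate` packages are installed (`load_in_8bit` is a `from_pretrained` argument in recent `transformers` releases):
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "bigscience/bloomz-7b1"  # illustrative; any bloomz checkpoint works
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Quantize the weights to int8 on load, roughly halving memory versus fp16
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", load_in_8bit=True)

inputs = tokenizer.encode("Why is the sky blue?", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```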
# BibTeX entry and citation info
```bibtex
TODO
```