Update README.md
README.md
CHANGED
@@ -45,11 +45,99 @@ T5-Base is the checkpoint with 220 million parameters.
 - [Google's T5 Blog Post](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html)
 - [GitHub Repo](https://github.com/google-research/text-to-text-transfer-transformer)
 - [Hugging Face T5 Docs](https://huggingface.co/docs/transformers/model_doc/t5)
 
 # Usage
 
 Find below some example scripts on how to use the model in `transformers`:
+
+## Using the PyTorch model
+
+### Running the model on a CPU
+
+<details>
+<summary> Click to expand </summary>
+
+```python
+from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
+model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large")
+
+input_text = "translate English to German: How old are you?"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids
+
+outputs = model.generate(input_ids)
+print(tokenizer.decode(outputs[0]))
+```
+
+</details>
+
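+The decoded string includes special tokens such as `<pad>` and `</s>`; passing `skip_special_tokens=True` strips them. A minimal sketch, reusing the `tokenizer` and `outputs` from above:
+
+```python
+# Drop the <pad>/</s> markers that surround the generated text.
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+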
+### Running the model on a GPU
+
+<details>
+<summary> Click to expand </summary>
+
+```python
+# pip install accelerate
+from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
+model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large", device_map="auto")
+
+input_text = "translate English to German: How old are you?"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+outputs = model.generate(input_ids)
+print(tokenizer.decode(outputs[0]))
+```
+
+</details>
+
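+The same pattern extends to batched prompts; a minimal sketch, assuming the `tokenizer` and `model` loaded above (the prompt texts are illustrative):
+
+```python
+# Tokenize several prompts at once; padding aligns them to one length.
+inputs = tokenizer(
+    ["translate English to German: How old are you?",
+     "translate English to French: How old are you?"],
+    return_tensors="pt",
+    padding=True,
+).to("cuda")
+
+# generate() takes input_ids and attention_mask straight from the batch.
+outputs = model.generate(**inputs, max_new_tokens=40)
+print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
+```
+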
+### Running the model on a GPU using different precisions
+
+#### FP16
+
+<details>
+<summary> Click to expand </summary>
+
+```python
+# pip install accelerate
+import torch
+from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
+model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large", device_map="auto", torch_dtype=torch.float16)
+
+input_text = "translate English to German: How old are you?"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+outputs = model.generate(input_ids)
+print(tokenizer.decode(outputs[0]))
+```
+
+</details>
+
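+On GPUs that support it (Ampere or newer), `torch.bfloat16` is a drop-in alternative to `torch.float16` with a wider dynamic range; a sketch of the one line that changes, reusing the imports above:
+
+```python
+# Same loading call as the FP16 example, with bfloat16 instead.
+model = T5ForConditionalGeneration.from_pretrained(
+    "google/flan-t5-large", device_map="auto", torch_dtype=torch.bfloat16
+)
+```
+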
+#### INT8
+
+<details>
+<summary> Click to expand </summary>
+
+```python
+# pip install bitsandbytes accelerate
+from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
+model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large", device_map="auto", load_in_8bit=True)
+
+input_text = "translate English to German: How old are you?"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+outputs = model.generate(input_ids)
+print(tokenizer.decode(outputs[0]))
+```
+
+</details>
+
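+Each of the snippets above can also be wrapped in the high-level `pipeline` API, which handles tokenization and decoding internally; a minimal sketch with default generation settings:
+
+```python
+from transformers import pipeline
+
+# "text2text-generation" covers T5-style encoder-decoder checkpoints.
+translator = pipeline("text2text-generation", model="google/flan-t5-large")
+print(translator("translate English to German: How old are you?"))
+```
+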
 # Uses
 
 ## Direct Use and Downstream Use

@@ -146,15 +234,22 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 **BibTeX:**
 
 ```bibtex
-@
-
-
-
-
-
-
-
-
+@misc{https://doi.org/10.48550/arxiv.2210.11416,
+  doi = {10.48550/ARXIV.2210.11416},
+  url = {https://arxiv.org/abs/2210.11416},
+  author = {Chung, Hyung Won and Hou, Le and Longpre, Shayne and Zoph, Barret and Tay, Yi and Fedus, William and Li, Eric and Wang, Xuezhi and Dehghani, Mostafa and Brahma, Siddhartha and Webson, Albert and Gu, Shixiang Shane and Dai, Zhuyun and Suzgun, Mirac and Chen, Xinyun and Chowdhery, Aakanksha and Narang, Sharan and Mishra, Gaurav and Yu, Adams and Zhao, Vincent and Huang, Yanping and Dai, Andrew and Yu, Hongkun and Petrov, Slav and Chi, Ed H. and Dean, Jeff and Devlin, Jacob and Roberts, Adam and Zhou, Denny and Le, Quoc V. and Wei, Jason},
+  keywords = {Machine Learning (cs.LG), Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+  title = {Scaling Instruction-Finetuned Language Models},
+  publisher = {arXiv},
+  year = {2022},
+  copyright = {Creative Commons Attribution 4.0 International}
 }
 ```
 