ybelkada committed on
Commit
f1238aa
1 Parent(s): e417c0d

Update README.md

Files changed (1)
  1. README.md +106 -11
README.md CHANGED
@@ -45,11 +45,99 @@ T5-Base is the checkpoint with 220 million parameters.
  - [Google's T5 Blog Post](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html)
  - [GitHub Repo](https://github.com/google-research/text-to-text-transfer-transformer)
  - [Hugging Face T5 Docs](https://huggingface.co/docs/transformers/model_doc/t5)
-
  # Usage
 
  Find below some example scripts on how to use the model in `transformers`:
-
  # Uses
 
  ## Direct Use and Downstream Use
@@ -146,15 +234,22 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
  **BibTeX:**
 
  ```bibtex
- @article{2020t5,
- author = {Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu},
- title = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},
- journal = {Journal of Machine Learning Research},
- year = {2020},
- volume = {21},
- number = {140},
- pages = {1-67},
- url = {http://jmlr.org/papers/v21/20-074.html}
  }
  ```
 
 
  - [Google's T5 Blog Post](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html)
  - [GitHub Repo](https://github.com/google-research/text-to-text-transfer-transformer)
  - [Hugging Face T5 Docs](https://huggingface.co/docs/transformers/model_doc/t5)
+
  # Usage
 
  Find below some example scripts on how to use the model in `transformers`:
+
+ ## Using the PyTorch model
+
+ ### Running the model on the CPU
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+ tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
+ model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large")
+
+ input_text = "translate English to German: How old are you?"
+ # Calling the tokenizer returns a dict-like object that exposes `.input_ids`
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids
+
+ outputs = model.generate(input_ids)
+ print(tokenizer.decode(outputs[0]))
+ ```
+
+ </details>
+
+ ### Running the model on a GPU
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ # pip install accelerate  (needed for device_map="auto")
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+ tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
+ model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large", device_map="auto")
+
+ input_text = "translate English to German: How old are you?"
+ # Tokenize, then move the input IDs to the GPU
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+ outputs = model.generate(input_ids)
+ print(tokenizer.decode(outputs[0]))
+ ```
+
+ </details>
+
+ ### Running the model on a GPU using different precisions
+
+ #### FP16
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ # pip install accelerate  (needed for device_map="auto")
+ import torch
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+ tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
+ # Load the weights in half precision
+ model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large", device_map="auto", torch_dtype=torch.float16)
+
+ input_text = "translate English to German: How old are you?"
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+ outputs = model.generate(input_ids)
+ print(tokenizer.decode(outputs[0]))
+ ```
+
+ </details>
+
+ #### INT8
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ # pip install bitsandbytes accelerate
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+ tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
+ # Load the weights quantized to 8 bits via bitsandbytes
+ model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large", device_map="auto", load_in_8bit=True)
+
+ input_text = "translate English to German: How old are you?"
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+ outputs = model.generate(input_ids)
+ print(tokenizer.decode(outputs[0]))
+ ```
+
+ </details>
+
  # Uses
 
  ## Direct Use and Downstream Use
 
  **BibTeX:**
 
  ```bibtex
+ @misc{https://doi.org/10.48550/arxiv.2210.11416,
+ doi = {10.48550/ARXIV.2210.11416},
+
+ url = {https://arxiv.org/abs/2210.11416},
+
+ author = {Chung, Hyung Won and Hou, Le and Longpre, Shayne and Zoph, Barret and Tay, Yi and Fedus, William and Li, Eric and Wang, Xuezhi and Dehghani, Mostafa and Brahma, Siddhartha and Webson, Albert and Gu, Shixiang Shane and Dai, Zhuyun and Suzgun, Mirac and Chen, Xinyun and Chowdhery, Aakanksha and Narang, Sharan and Mishra, Gaurav and Yu, Adams and Zhao, Vincent and Huang, Yanping and Dai, Andrew and Yu, Hongkun and Petrov, Slav and Chi, Ed H. and Dean, Jeff and Devlin, Jacob and Roberts, Adam and Zhou, Denny and Le, Quoc V. and Wei, Jason},
+
+ keywords = {Machine Learning (cs.LG), Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+
+ title = {Scaling Instruction-Finetuned Language Models},
+
+ publisher = {arXiv},
+
+ year = {2022},
+
+ copyright = {Creative Commons Attribution 4.0 International}
  }
  ```
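
Review note (not part of the commit): the four usage snippets added to the README in this diff appear to differ only in the keyword arguments passed to `from_pretrained` (`device_map`, `torch_dtype`, `load_in_8bit`). Assuming that reading is right, the variants could be summarized in one small helper. This is a sketch, not the author's code: the names `build_load_kwargs` and `load_model` are hypothetical, and the dtype is kept as a string until load time so the kwargs function can be checked without `torch` installed.

```python
VALID_PRECISIONS = ("fp32", "fp16", "int8")


def build_load_kwargs(precision="fp32", use_gpu=False):
    """Return the from_pretrained keyword arguments for one README variant.

    Hypothetical helper: it only mirrors the arguments shown in the diff.
    """
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"unknown precision: {precision!r}")
    if precision != "fp32" and not use_gpu:
        # The FP16 and INT8 snippets in the diff are both GPU examples.
        raise ValueError("fp16 and int8 variants assume use_gpu=True")

    kwargs = {}
    if use_gpu:
        kwargs["device_map"] = "auto"  # requires the accelerate package
    if precision == "fp16":
        kwargs["torch_dtype"] = "float16"  # resolved to torch.float16 in load_model
    elif precision == "int8":
        kwargs["load_in_8bit"] = True  # requires the bitsandbytes package
    return kwargs


def load_model(name, precision="fp32", use_gpu=False):
    """Load tokenizer and model with the kwargs for the chosen variant."""
    # Imported lazily so build_load_kwargs stays usable without these packages.
    import torch
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    kwargs = build_load_kwargs(precision, use_gpu)
    if "torch_dtype" in kwargs:
        kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])

    tokenizer = T5Tokenizer.from_pretrained(name)
    model = T5ForConditionalGeneration.from_pretrained(name, **kwargs)
    return tokenizer, model
```

With this sketch, `load_model("google/flan-t5-large", precision="fp16", use_gpu=True)` would correspond to the FP16 snippet. The input IDs still need to be moved to the GPU before calling `generate`, as in the README examples.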