Update README.md
README.md
CHANGED
@@ -68,7 +68,7 @@ tags:
 datasets:
 - svakulenk0/qrecc
 - taskmaster2
-- djaym7/wiki_dialog
+- djaym7/wiki_dialog
 - deepmind/code_contests
 - lambada
 - gsm8k
@@ -87,6 +87,7 @@ license: apache-2.0
 
 # Table of Contents
 
+0. [TL;DR](#TL;DR)
 1. [Model Details](#model-details)
 2. [Usage](#usage)
 3. [Uses](#uses)
@@ -96,28 +97,25 @@ license: apache-2.0
 7. [Environmental Impact](#environmental-impact)
 8. [Citation](#citation)
 9. [Model Card Authors](#model-card-authors)
-10. [How To Get Started With the Model](#how-to-get-started-with-the-model)
 
-#
+# TL;DR
 
-
+> Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.
 
-
+# Model Details
 
-
+## Model Description
 
-T5-Base is the checkpoint with 220 million parameters.
 
-- **Developed by:** Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei. See [associated paper](https://arxiv.org/pdf/2210.11416.pdf) and [GitHub repo](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)
 - **Model type:** Language model
-- **Language(s) (NLP):** English, French, Romanian,
+- **Language(s) (NLP):** English, Spanish, Japanese, Persian, Hindi, French, Chinese, Bengali, Gujarati, German, Telugu, Italian, Arabic, Polish, Tamil, Marathi, Malayalam, Oriya, Panjabi, Portuguese, Urdu, Galician, Hebrew, Korean, Catalan, Thai, Dutch, Indonesian, Vietnamese, Bulgarian, Filipino, Central Khmer, Lao, Turkish, Russian, Croatian, Swedish, Yoruba, Kurdish, Burmese, Malay, Czech, Finnish, Somali, Tagalog, Swahili, Sinhala, Kannada, Zhuang, Igbo, Xhosa, Romanian, Haitian, Estonian, Slovak, Lithuanian, Greek, Nepali, Assamese, Norwegian
 - **License:** Apache 2.0
-- **Related Models:** [All T5 Checkpoints](https://huggingface.co/models?search=t5)
+- **Related Models:** [All FLAN-T5 Checkpoints](https://huggingface.co/models?search=flan-t5)
+- **Related Original Models:** [All Original FLAN-T5 Checkpoints](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)
 - **Resources for more information:**
-  - [Research paper](https://
-  - [
-  - [
-  - [Hugging Face T5 Docs](https://huggingface.co/docs/transformers/model_doc/t5)
+  - [Research paper](https://arxiv.org/pdf/2210.11416.pdf)
+  - [GitHub Repo](https://github.com/google-research/t5x)
+  - [Hugging Face FLAN-T5 Docs (Similar to T5)](https://huggingface.co/docs/transformers/model_doc/t5)
 
 # Usage
 
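The new card's `# Usage` section appears here only as trailing diff context, so its body is not shown. For orientation, a minimal sketch of how such a checkpoint is typically loaded with the `transformers` API the card links to; the `google/flan-t5-base` repo id is an assumption, since the diff does not name the exact checkpoint:

```python
# Hedged sketch, not the README's own snippet: assumes the google/flan-t5-base
# repo id on the Hugging Face Hub and a standard transformers installation.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")

# Instruction-style prompt of the kind FLAN-T5 is finetuned on.
input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate with default decoding settings and print the decoded text.
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```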