chiyuzhang committed
Commit 22e8baa
Parent(s): c40ca08
Update README.md

README.md CHANGED
@@ -18,14 +18,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on the [LaMini dataset](), which contains 2.58M samples for instruction fine-tuning. For more information about our dataset, please refer to our [project repository]().
 
-##
-
+## Training Procedure
 We initialize with [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) and fine-tune it on our [LaMini dataset](). Its total number of parameters is 61M.
 
-
-## Training procedure
-
-### Training hyperparameters
+### Training Hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0005
@@ -38,10 +34,10 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - num_epochs: 5
 
-##
+## Evaluation
 We conducted two sets of evaluations: automatic evaluation on downstream NLP tasks and human evaluation on user-oriented instructions. For more details, please refer to our [paper]().
 
-##
+## More Models
 You can download the LaMini model series as follows. Note that not all models perform equally well. More details can be found in our [paper]().
 <details>
 <summary> Click to expand </summary>
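The remaining hyperparameters fall outside this hunk and are not shown. As a rough illustration only (the card does not say which training API was used, and the `output_dir` below is a hypothetical placeholder), the three values visible here would map onto a `transformers` training configuration like this:

```python
# Sketch only: maps the hyperparameters visible in the hunk above onto
# Seq2SeqTrainingArguments. Values not shown in this diff are left at
# library defaults; output_dir is a made-up placeholder.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./lamini-flan-t5-small",  # hypothetical output path
    learning_rate=5e-4,                   # learning_rate: 0.0005
    lr_scheduler_type="linear",           # lr_scheduler_type: linear
    num_train_epochs=5,                   # num_epochs: 5
)
```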
@@ -130,6 +126,10 @@ You can download LaMini model series as follow. Note that not all models are per
 
 ## Use
 
+### Intended use
+
+
+We now show you how to load and use our model with the HuggingFace `pipeline()` API.
 ### CPU
 
 <details>
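The CPU usage snippet itself is collapsed inside the `<details>` block and does not appear in this diff; only its final `print` call surfaces in the next hunk header. A minimal sketch of what such `pipeline()` usage looks like is given below. The checkpoint id is a placeholder (the diff does not name this model's repository), and the prompt is an arbitrary example.

```python
# Minimal sketch of CPU inference with the HuggingFace pipeline API.
# Replace `checkpoint` with this model's actual repository id; the base
# model "google/flan-t5-small" is used here only as a stand-in.
from transformers import pipeline

checkpoint = "google/flan-t5-small"  # placeholder checkpoint id
generator = pipeline("text2text-generation", model=checkpoint)

instruction = "Write a short note explaining why regular exercise is important."
generated_text = generator(instruction, max_length=512, do_sample=True)[0]["generated_text"]
print("Response:", generated_text)
```

No `device` argument is passed, so the pipeline runs on the CPU by default.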
@@ -172,14 +172,19 @@ print("Response:", generated_text)
 
 </details>
 
-##
+## Limitations
 
 More information needed
 
 
-
-
-
-
-
-
+# Citation
+```bibtex
+@misc{,
+      title={LaMini: Distilling Knowledge from Large Language Models},
+      author={},
+      year={2023},
+      eprint={},
+      archivePrefix={},
+      primaryClass={}
+}
+```