chiyuzhang committed
Commit 22e8baa
Parent(s): c40ca08
Update README.md

README.md CHANGED
@@ -18,14 +18,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on the [LaMini dataset](), which contains 2.58M samples for instruction fine-tuning. For more information about our dataset, please refer to our [project repository]().
 
-##
-
+## Training Procedure
 We initialize with [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) and fine-tune it on our [LaMini dataset](). Its total number of parameters is 61M.
 
-
-## Training procedure
-
-### Training hyperparameters
+### Training Hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0005
@@ -38,10 +34,10 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - num_epochs: 5
 
-##
+## Evaluation
 We conducted two sets of evaluations: automatic evaluation on downstream NLP tasks and human evaluation on user-oriented instructions. For more details, please refer to our [paper]().
 
-##
+## More Models
 You can download the LaMini model series as follows. Note that not all models perform equally well. More details can be found in our [paper]().
 <details>
 <summary> Click to expand </summary>
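The remaining hyperparameters fall outside this hunk and are not shown. As a rough illustration only (the card does not say which training API was used, and the `output_dir` below is a hypothetical placeholder), the three values visible here would map onto a `transformers` training configuration like this:

```python
# Sketch only: maps the hyperparameters visible in the hunk above onto
# Seq2SeqTrainingArguments. Values not shown in this diff are left at
# library defaults; output_dir is a made-up placeholder.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./lamini-flan-t5-small",  # hypothetical output path
    learning_rate=5e-4,                   # learning_rate: 0.0005
    lr_scheduler_type="linear",           # lr_scheduler_type: linear
    num_train_epochs=5,                   # num_epochs: 5
)
```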
@@ -130,6 +126,10 @@ You can download LaMini model series as follow. Note that not all models are per
 
 ## Use
 
+### Intended use
+
+
+We now show you how to load and use our model with the HuggingFace `pipeline()` API.
 ### CPU
 
 <details>
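The CPU usage snippet itself is collapsed inside the `<details>` block and does not appear in this diff; only its final `print` call surfaces in the next hunk header. A minimal sketch of what such `pipeline()` usage looks like is given below. The checkpoint id is a placeholder (the diff does not name this model's repository), and the prompt is an arbitrary example.

```python
# Minimal sketch of CPU inference with the HuggingFace pipeline API.
# Replace `checkpoint` with this model's actual repository id; the base
# model "google/flan-t5-small" is used here only as a stand-in.
from transformers import pipeline

checkpoint = "google/flan-t5-small"  # placeholder checkpoint id
generator = pipeline("text2text-generation", model=checkpoint)

instruction = "Write a short note explaining why regular exercise is important."
generated_text = generator(instruction, max_length=512, do_sample=True)[0]["generated_text"]
print("Response:", generated_text)
```

No `device` argument is passed, so the pipeline runs on the CPU by default.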
@@ -172,14 +172,19 @@ print("Response:", generated_text)
 
 </details>
 
-##
+## Limitations
 
 More information needed
 
 
-
-
-
-
-
-
+# Citation
+```bibtex
+@misc{,
+      title={LaMini: Distilling Knowledge from Large Language Models},
+      author={},
+      year={2023},
+      eprint={},
+      archivePrefix={},
+      primaryClass={}
+}
+```