Update README.md
README.md
@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # GPT-2_para3M
 
-This model is a
+This model is a version of [gpt2](https://huggingface.co/gpt2) pretrained on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 2.3207
 
@@ -23,8 +23,9 @@ More information needed
 
 ## Intended uses & limitations
 
-
-
+This model has two main limitations:
+* The model has only about 3.6 million parameters, which is small. As a result, the range of text it can generate is limited.
+* The training dataset consists only of stories, which greatly limits the model: it can only generate stories.
 ## Training and evaluation data
 
 More information needed