Update README.md
README.md
CHANGED
@@ -27,12 +27,13 @@ should probably proofread and complete it, then remove this comment. -->
 # gpt2-shikoto
 
 This model was trained on a dataset I obtained from an online novel site.
-**Please be aware that the stories (training data) might contain inappropriate content
+**Please be aware that the stories (training data) might contain inappropriate content. This model is intended for research purposes only.**
+
 
 
 The base model can be found [here](https://huggingface.co/jed351/gpt2-tiny-zh-hk), which was obtained by
 patching a [GPT2 Chinese model](https://huggingface.co/ckiplab/gpt2-tiny-chinese) and its tokenizer with Cantonese characters.
-Refer to the base model for info
+Refer to the base model for info on the patching process.
 
 
 
@@ -43,7 +44,7 @@ Please refer to the [script](https://github.com/huggingface/transformers/tree/ma
 provided by Huggingface.
 
 
-The model was trained for 400,000 steps on 2 NVIDIA Quadro RTX6000 for around 15 hours.
+The model was trained for 400,000 steps on 2 NVIDIA Quadro RTX6000 for around 15 hours at the Research Computing Services of Imperial College London.
 
 
 ### Training hyperparameters