Update README.md
README.md CHANGED
@@ -96,11 +96,48 @@ model = AutoModelForCausalLM.from_pretrained("supermy/poetry")
## Training procedure

Model: [GPT2](https://huggingface.co/gpt2)

Training environment: an NVIDIA GPU with 16 GB of VRAM

BPE tokenization: "vocab_size" = 50000
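
The tokenizer-training script itself is not part of the card. The sketch below shows one way a 50,000-token byte-level BPE vocabulary could be built with the fast tokenizer API; the corpus file `poetry.txt`, the batching helper, and the output directory `poetry-tokenizer` are assumptions for illustration only.

```python
# Hypothetical sketch: train a 50,000-token BPE tokenizer for the poetry corpus.
# Only vocab_size=50000 is taken from the card above; file names are assumed.
from transformers import AutoTokenizer

def corpus_iterator(path="poetry.txt", batch_size=1000):
    """Yield batches of raw poem lines for tokenizer training."""
    batch = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            batch.append(line.strip())
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch

base = AutoTokenizer.from_pretrained("gpt2")  # byte-level BPE, fast (Rust-backed) tokenizer
tokenizer = base.train_new_from_iterator(corpus_iterator(), vocab_size=50000)
tokenizer.save_pretrained("poetry-tokenizer")  # assumed output directory
```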

Training log (abridged):

```
***** Running training *****
  Num examples = 16431
  Num Epochs = 680
  Instantaneous batch size per device = 24
  Total train batch size (w. parallel, distributed & accumulation) = 192
  Gradient Accumulation steps = 8
  Total optimization steps = 57800
  Number of trainable parameters = 124242432
GPT-2 size: 124.2M parameters
  0%|          | 0/57800 [00:00<?, ?it/s]
You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
  9%|▊         | 5000/57800 [6:58:57<72:53:18, 4.97s/it]
***** Running Evaluation *****
  Num examples = 1755
  Batch size = 24
{'loss': 3.1345, 'learning_rate': 0.0004939065828881268, 'epoch': 58.82}
Saving model checkpoint to poetry-trainer/checkpoint-5000
Configuration saved in poetry-trainer/checkpoint-5000/config.json
Model weights saved in poetry-trainer/checkpoint-5000/pytorch_model.bin
tokenizer config file saved in poetry-trainer/checkpoint-5000/tokenizer_config.json
Special tokens file saved in poetry-trainer/checkpoint-5000/special_tokens_map.json
 17%|█▋        | 10000/57800 [13:55:32<65:40:41, 4.95s/it]
***** Running Evaluation *****
  Num examples = 1755
  Batch size = 24
{'eval_loss': 11.14090633392334, 'eval_runtime': 16.8326, 'eval_samples_per_second': 104.262, 'eval_steps_per_second': 4.396, 'epoch': 58.82}
{'loss': 0.2511, 'learning_rate': 0.00046966687938531824, 'epoch': 117.64}
Saving model checkpoint to poetry-trainer/checkpoint-10000
..........
 95%|█████████▌| 55000/57800 [76:06:46<3:59:33, 5.13s/it]
***** Running Evaluation *****
  Num examples = 1755
  Batch size = 24
{'eval_loss': 14.860174179077148, 'eval_runtime': 16.7826, 'eval_samples_per_second': 104.572, 'eval_steps_per_second': 4.409, 'epoch': 588.23}
{'loss': 0.0083, 'learning_rate': 3.0262183266589473e-06, 'epoch': 647.06}
Saving model checkpoint to poetry-trainer/checkpoint-55000
{'eval_loss': 14.830656051635742, 'eval_runtime': 16.7365, 'eval_samples_per_second': 104.86, 'eval_steps_per_second': 4.421, 'epoch': 647.06}
{'train_runtime': 287920.5857, 'train_samples_per_second': 38.806, 'train_steps_per_second': 0.201, 'train_loss': 0.33751299874592816, 'epoch': 679.99}
100%|██████████| 57800/57800 [79:58:40<00:00, 4.93s/it]
```
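
The 124,242,432 trainable parameters (124.2M) reported in the log are exactly what a standard 12-layer GPT-2 "small" architecture yields once its vocabulary is set to the 50,000-token BPE vocabulary above. A quick check (a sketch, not part of the original card):

```python
# Sanity check: GPT-2 "small" with a 50,000-token vocabulary has exactly the
# 124,242,432 trainable parameters reported in the training log.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(vocab_size=50000)  # every other size left at the GPT-2 small default
model = GPT2LMHeadModel(config)        # the LM head is weight-tied to the token embedding

n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(n_trainable)  # 124242432, i.e. about 124.2M parameters
```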
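
The training script is likewise not included, but the hyperparameters visible in the log (per-device batch size 24, gradient accumulation 8, hence a total batch size of 24 × 8 = 192 on a single GPU; 680 epochs; evaluation and checkpoints every 5,000 steps) map onto a `Trainer` configuration roughly like the sketch below. The learning rate, fp16 flag, sequence length, and the two-line stand-in dataset are assumptions, not taken from this repository.

```python
# Reconstruction sketch of the Trainer setup implied by the log. Values marked
# "from the log" come from the card above; everything else is an assumption.
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2Config,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("supermy/poetry")  # the published tokenizer
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 style tokenizers often lack a pad token

# Tiny stand-in corpus so the sketch runs end to end; the real run used
# 16,431 training examples and 1,755 evaluation examples.
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_dataset = Dataset.from_dict({"text": ["白日依山尽", "黄河入海流"]}).map(tokenize, batched=True)
eval_dataset = Dataset.from_dict({"text": ["欲穷千里目"]}).map(tokenize, batched=True)

model = GPT2LMHeadModel(GPT2Config(vocab_size=50000))  # fresh GPT-2 small with the 50k vocab

args = TrainingArguments(
    output_dir="poetry-trainer",        # matches the checkpoint paths in the log
    per_device_train_batch_size=24,     # from the log
    per_device_eval_batch_size=24,      # from the log ("Batch size = 24")
    gradient_accumulation_steps=8,      # from the log (24 * 8 = 192 total batch size)
    num_train_epochs=680,               # from the log
    learning_rate=5e-4,                 # assumption, consistent with the logged values
    evaluation_strategy="steps",
    eval_steps=5_000,                   # evaluations appear every 5,000 steps in the log
    save_steps=5_000,                   # checkpoints appear every 5,000 steps in the log
    logging_steps=5_000,
    fp16=True,                          # assumption: helps GPT-2 fit the 16 GB GPU
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

For consistency: 16,431 examples at a total batch size of 192 give 85 optimization steps per epoch, and 85 × 680 epochs matches the 57,800 total optimization steps reported in the log.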