harry-GPTter
harry-GPTter is a transformer text generation model implemented in PyTorch. It was trained on the text of all 7 books of the Harry Potter series. In only 10 minutes of training on the free tier of Google Colaboratory, the model learnt to generate coherent and grammatically correct sentences.
- Code and more information in the GitHub Repository
- Download the weights
Text Generation with harry-GPTter
“Ah,” said Mrs. Weasley, hiscolored lips looking unpleasant. “He wasn’t talking about her, he has tried to think he was saying he had looked up. The bleers were flooding.”
“My master died?” whispered Voldemort, but the wasnoddenbling until he are, making to be seeing him.
“I’ll see you, Professor Lockhart,” said Hermione, “but so surely now to have solid on it out of her whole bed! You’re thinking —
“Oh hello the unconscious!”
“And now blimey,” said Harry, “it was a very serious for an enormous mother. ...”
Model Details
harry-GPTter is a relatively small language model with 56M parameters, less than half the size of the smallest GPT-2 model. It contains 8 layers of 8-headed attention with a hidden size of 384, and it supports a maximum sequence length of 128 tokens. For tokenization, we use the same tokenizer as text-davinci-003, which has a vocabulary of 50,280 tokens in total.
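To make these numbers concrete, here is a rough sketch that collects the hyperparameters in a config object and estimates the parameter count from them. The names `ModelConfig` and `estimate_parameters` are hypothetical, and the layout is an assumption rather than the repository's actual code: GPT-2 style blocks with a 4x MLP, learned position embeddings, biases on every linear layer, and an output projection that is not tied to the token embedding.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Values taken from the model description.
    vocab_size: int = 50_280
    n_layer: int = 8
    n_head: int = 8
    d_model: int = 384
    max_seq_len: int = 128
    # Assumptions, not stated in the description.
    mlp_ratio: int = 4
    tie_embeddings: bool = False


def estimate_parameters(cfg: ModelConfig) -> int:
    """Estimate the parameter count of a GPT-2 style decoder (assumed layout)."""
    d = cfg.d_model
    token_emb = cfg.vocab_size * d
    pos_emb = cfg.max_seq_len * d
    # Query, key, value and output projections, each with a bias.
    attention = 4 * (d * d + d)
    hidden = cfg.mlp_ratio * d
    # Two linear layers: d -> hidden and hidden -> d, each with a bias.
    mlp = (d * hidden + hidden) + (hidden * d + d)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * 2 * d
    block = attention + mlp + layer_norms
    final_norm = 2 * d
    lm_head = 0 if cfg.tie_embeddings else cfg.vocab_size * d
    return token_emb + pos_emb + cfg.n_layer * block + final_norm + lm_head


cfg = ModelConfig()
print(f"head dimension: {cfg.d_model // cfg.n_head}")
print(f"estimated parameters: {estimate_parameters(cfg) / 1e6:.1f}M")
```

Under these assumptions the estimate comes out at roughly 53M, close to but not equal to the reported 56M, so the real implementation presumably differs in details not described here. The estimate also shows that most of the parameters sit in the two vocabulary-sized matrices rather than in the transformer blocks.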
The model was trained for 2000 epochs in about 10 minutes on the free tier of the Google Colab GPU runtime. It achieves a cross-entropy loss of 3.1189.
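A cross-entropy loss is easier to interpret as a perplexity. The short sketch below does the conversion, assuming the reported loss is the mean per-token cross-entropy in nats (the default for PyTorch's cross-entropy loss); that is an assumption, since the text does not say how the loss was computed or on which data split. For comparison it also computes the loss of a uniform guess over the vocabulary.

```python
import math

cross_entropy = 3.1189  # loss reported for the model
vocab_size = 50_280     # tokenizer vocabulary size

# Perplexity is the exponential of the mean per-token cross-entropy (in nats).
perplexity = math.exp(cross_entropy)

# Baseline: a model that assigns equal probability to every token.
uniform_loss = math.log(vocab_size)

print(f"perplexity: {perplexity:.1f}")
print(f"uniform-guess loss: {uniform_loss:.2f}")
```

Under that assumption the perplexity is about 22.6, meaning the model is on average about as uncertain as if it were choosing uniformly among 22 or 23 tokens at each step, compared with a loss of about 10.83 for a uniform guess over the whole vocabulary.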
This model was built for learning purposes. You can probably get better performance by finetuning a pre-trained model.