gpt2-small-amharic-8k-128-v3

This is a smaller version of the gpt2 decoder transformer model, pretrained from scratch for 1.5 days on 290 million tokens of Amharic text.

  • It has 29.5 million parameters.
  • The context size of this model is 128 tokens.
  • It uses the same tokenizer as gpt2, trained from scratch on the same dataset with a vocabulary size of 8192.
  • This is a base model and hasn't undergone any supervised finetuning yet.

It achieves the following results on the evaluation set:

  • Loss: 3.59
  • Perplexity: 36.23
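
The perplexity figure follows directly from the evaluation loss, since perplexity is the exponential of the mean cross-entropy loss. A quick sanity check:

```python
import math

eval_loss = 3.59

# Perplexity is e raised to the cross-entropy loss.
perplexity = math.exp(eval_loss)

print(round(perplexity, 2))  # → 36.23
```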

Demo

You can use the following demo to generate text with this model. Enter a prompt and click the Generate button to produce completions.

https://huggingface.co/spaces/rasyosef/GPT2-Amharic
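
If you prefer to run the model locally, a minimal sketch using the Hugging Face `transformers` text-generation pipeline is shown below. The model id is taken from this page; the Amharic prompt and the sampling parameters (`max_new_tokens`, `do_sample`, `top_k`) are illustrative choices, not values prescribed by the model card. Note that the prompt plus the completion must fit in the 128-token context window.

```python
# Sketch: generate Amharic text with this model via transformers.
# Requires `pip install transformers torch` and a network connection
# to download the weights on first use.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="rasyosef/gpt2-small-amharic-8k-128-v3",
)

prompt = "ኢትዮጵያ"  # example prompt ("Ethiopia"); replace with your own

# Keep prompt length + max_new_tokens within the 128-token context.
outputs = generator(prompt, max_new_tokens=64, do_sample=True, top_k=50)

print(outputs[0]["generated_text"])
```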
