This is a GPT-2 model trained from scratch for 330K steps on FineWeb-Edu, with a batch size of roughly 1M tokens per step, i.e. around 300B tokens in total.
Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub.
- Developed by: Ameer H
- Shared by: Ameer H
- Model type: GPT-2
- Language(s) (NLP): English
- License: MIT
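Since this is a standard GPT-2 checkpoint, it can be loaded with the usual 🤗 transformers auto classes. The snippet below is a minimal usage sketch; the repository id `ameer-h/gpt2-fineweb-edu` is a hypothetical placeholder and should be replaced with this model's actual Hub id.

```python
# Minimal usage sketch for a GPT-2 checkpoint on the Hub.
# "ameer-h/gpt2-fineweb-edu" is a placeholder repository id, not the confirmed one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ameer-h/gpt2-fineweb-edu"  # replace with the real Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation from a prompt.
inputs = tokenizer("The theory of relativity states that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```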
Bias, Risks, and Limitations
The model may produce incoherent text and unintended offensive content, including racial slurs. This is an experiment only and should be used with caution.
Forked from Andrej Karpathy's original model.