Pretrained toy models, built with Andrej Karpathy's nanoGPT.
nano_35m
- Trained in late 2023 on part of the Tagalog portion of the Belebele dataset.
- batch_size = 64
- block_size = 256
- n_layer = 8
- n_head = 8
- n_embd = 768
- All other settings are left at the nanoGPT defaults.
nano_76m
- Trained in January 2024 on part of the Tagalog portion of the Belebele dataset.
- batch_size = 64
- block_size = 256
- n_layer = 11
- n_head = 16
- n_embd = 768
- All other settings are left at the nanoGPT defaults.
nano-ito_35m
- Trained in March 2024 on part of the PALITO Tagalog dataset.
- batch_size = 64
- block_size = 256
- n_layer = 11
- n_head = 16
- n_embd = 512
- All other settings are left at the nanoGPT defaults.
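The hyperparameters above correspond to nanoGPT's config-override mechanism: train.py executes a small Python file whose assignments override its defaults. A minimal sketch of what the nano_35m run might look like as such a file — the filename, output directory, and dataset directory name are assumptions for illustration, not taken from this card:

```python
# Hypothetical nanoGPT override file, e.g. config/train_nano_35m.py.
# nanoGPT's train.py executes this file to override its defaults;
# any setting not listed here keeps the repository default value.

out_dir = 'out-nano_35m'        # assumed checkpoint/output directory
dataset = 'belebele_tagalog'    # assumed name of the prepared data directory

# Settings as listed in the nano_35m entry above:
batch_size = 64   # sequences per optimization step (per GPU)
block_size = 256  # context length in tokens
n_layer = 8       # transformer blocks
n_head = 8        # attention heads per block
n_embd = 768      # embedding / hidden dimension
```

Training would then be launched from the nanoGPT repository with `python train.py config/train_nano_35m.py`; the nano_76m and nano-ito_35m runs differ only in the values listed in their entries.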