---
license: other
datasets:
- facebook/belebele
---

Pretrained toy models, made with Andrej Karpathy's NanoGPT.

# nano_35m

* Trained in late 2023 on part of the Tagalog portion of Belebele.
* batch_size = 64
* block_size = 256
* n_layer = 8
* n_head = 8
* n_embd = 768
* Everything else is left as is.

# nano_76m

* Trained in January 2024 on part of the Tagalog portion of Belebele.
* batch_size = 64
* block_size = 256
* n_layer = 11
* n_head = 16
* n_embd = 768
* Everything else is left as is.

# nano-ito_35m

* Trained in March 2024 on part of the PALITO Tagalog dataset.
* batch_size = 64
* block_size = 256
* n_layer = 11
* n_head = 16
* n_embd = 512
* Everything else is left as is.
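
For reference, here is a minimal sketch of how the nano_35m settings above could be expressed as a nanoGPT training config. In nanoGPT, a config file is a plain Python script that overrides the defaults in `train.py`; the file name, `out_dir`, and `dataset` values below are assumptions for illustration and are not part of this card. Only the hyperparameters listed above are set; everything else keeps nanoGPT's defaults, as the card states.

```python
# config/train_nano_35m.py -- hypothetical file name
# Sketch of a nanoGPT config matching the nano_35m settings listed above.

out_dir = 'out-nano_35m'   # assumed output directory
dataset = 'belebele_tgl'   # assumed name of the prepared dataset folder under data/

# values from this model card
batch_size = 64
block_size = 256
n_layer = 8
n_head = 8
n_embd = 768
```

With nanoGPT, such a config would typically be used as `python train.py config/train_nano_35m.py`; the nano_76m and nano-ito_35m runs would differ only in the `n_layer`, `n_head`, `n_embd`, and dataset values listed in their sections.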