---
library_name: transformers
tags:
- indonesia
license: mit
language:
- id
inference: true
---
How small can language models be?
❕Go straight to the gradio demo❕
This repo contains the 25M version.
### 🎻 Welcome!
Sasando-1 is a tiny, highly experimental Indonesian text generator built using the Phi-3 architecture. It comes in two microscopic sizes: 7M and 25M parameters. It is trained on a tightly controlled Indo4B dataset, filtered to contain only 18,000 unique words. The method is inspired by Microsoft's TinyStories paper, which demonstrates that a tiny language model can produce fluent text when trained on a tightly controlled dataset.
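To make the idea of a vocabulary-limited corpus concrete, here is a minimal sketch of one plausible way to do such filtering: keep the most frequent words, then keep only sentences made entirely of those words. This is an illustration under assumptions, not necessarily the procedure used for Sasando-1. The function names, the whitespace tokenization, and the toy corpus are all hypothetical.

```python
from collections import Counter


def build_vocab(sentences, max_words):
    """Return the set of the `max_words` most frequent lowercase words."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    return {word for word, _ in counts.most_common(max_words)}


def filter_sentences(sentences, vocab):
    """Keep only sentences whose words all belong to `vocab`."""
    return [s for s in sentences if all(w in vocab for w in s.lower().split())]


# Toy corpus (hypothetical); a real run would stream Indo4B text instead.
corpus = [
    "saya suka makan nasi",
    "saya suka minum teh",
    "dia suka makan nasi",
    "kucing itu tidur",
]

# Assumption: a cap of 4 words here stands in for the 18,000-word cap.
vocab = build_vocab(corpus, max_words=4)
kept = filter_sentences(corpus, vocab)
print(kept)
```

On a real corpus, `max_words` would be set to 18000, and a proper word tokenizer would likely replace the simple `split()` call.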
### ✨ Specs
- Comes with 7M and 25M parameters
- Based on Phi-3 architecture
- Embedding vocab 4096
- Trained on ~257M tokens × 4 epochs
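As a rough back-of-the-envelope check on the training budget listed above, the sketch below multiplies the approximate token count by the number of epochs and compares it to each model size. The figures come from the specs; the variable names are illustrative, and the results are only as precise as the "~257M" and "7M"/"25M" figures.

```python
# Approximate figures taken from the specs list.
tokens_per_epoch = 257_000_000
epochs = 4
model_sizes = {"7M": 7_000_000, "25M": 25_000_000}

# Total tokens seen during training (repeated epochs count each time).
total_tokens = tokens_per_epoch * epochs
print(f"Total tokens seen: {total_tokens:,}")

# Tokens seen per parameter for each model size.
tokens_per_param = {name: total_tokens / n for name, n in model_sizes.items()}
for name, ratio in tokens_per_param.items():
    print(f"{name}: about {ratio:.0f} tokens per parameter")
```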
### 🔭 Out-of-Scope Use
This is a research preview base model. It is not instruction-tuned and has minimal safety curation. It is not intended for commercial or practical applications.
You are also not allowed to use this model without having fun.
### Acknowledgments
- **Developed by:** Afrizal Hasbi Azizy
- **License:** MIT