library_name: transformers
tags:
- indonesia
license: mit
language:
- id
inference: true
How small can language models be?
Sasando-1 is a tiny, highly experimental text generator built using the Phi-3 architecture.
❕Go straight to the Gradio demo❕
This repo contains the 25M version.
🇮🇩 Context
Indonesia has more than 700 languages, and many of them are dying at an alarming rate. Language technologies like generative AI can play a massive role in language preservation. However, Indonesia faces several contextual issues:
- Many languages, including those with millions of speakers, have few digital resources
- Running large models is costly, and Indonesia is a middle-income country with little funding
Overcoming these challenges requires developers to work with what little data and money they have. Sasando-1 is a prototype demonstrating that scarce resources can potentially still be leveraged to develop generative models with cheap compute.
✨ Specs
- Comes in 7M- and 25M-parameter versions
- Based on Phi-3 architecture
- Embedding vocabulary size: 4096
- Trained on ~257M tokens × 4 epochs
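To put the training budget in perspective, here is a rough back-of-the-envelope sketch that uses only the figures listed above. The helper names (`tokens_seen`, `tokens_per_parameter`) are made up for illustration, the parameter counts are the nominal 7M and 25M rather than exact values, and because the epochs repeat the same data, the total counts tokens seen during training, not unique tokens.

```python
# Figures taken from the specs list; all are approximate.
TOKENS_PER_EPOCH = 257_000_000  # ~257M tokens in the training set
EPOCHS = 4
PARAM_COUNTS = {"7M": 7_000_000, "25M": 25_000_000}  # nominal sizes (assumption)


def tokens_seen(tokens_per_epoch: int, epochs: int) -> int:
    """Total tokens processed during training (repeats included)."""
    return tokens_per_epoch * epochs


def tokens_per_parameter(total_tokens: int, n_params: int) -> float:
    """Ratio of tokens seen in training to model parameters."""
    return total_tokens / n_params


total = tokens_seen(TOKENS_PER_EPOCH, EPOCHS)
print(f"Total tokens seen: ~{total / 1e9:.2f}B")

for name, n_params in PARAM_COUNTS.items():
    ratio = tokens_per_parameter(total, n_params)
    print(f"{name} model: ~{ratio:.0f} tokens seen per parameter")
```

Under these assumptions, training covers roughly 1.03B tokens in total, or about 147 tokens per parameter for the 7M model and about 41 for the 25M model.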
🔭 Out-of-Scope Use
This is a research preview base model. It is not instruction-tuned and has minimal safety curation. It is not intended for commercial or practical applications.
You are also not allowed to use this model without having fun.
Acknowledgments
- Developed by: Afrizal Hasbi Azizy
- License: MIT