Update README.md
README.md
CHANGED
@@ -33,11 +33,10 @@ inference: true
 <p><strong><a href="https://huggingface.co/spaces/afrizalha/Sasando-1" style="color: blue; font-family: Tahoma;">❕Go straight to the gradio demo❕</a></strong></p>
 <p><em style="color: black; font-weight: bold;">This repo contains the 25M version.</em></p>
 </center>
-
-### 🎻 Welcome!
+## 🎻 Welcome!
 Sasando-1 is a tiny, highly experimental Indonesian text generator built using the Phi-3 architecture. It comes in two microscopic sizes: 7M and 25M parameters. It is trained on a tightly controlled subset of the Indo4B dataset, filtered to contain only 18,000 unique words. The method is inspired by Microsoft's TinyStories paper, which demonstrates that a tiny language model can produce fluent text when trained on a tightly controlled dataset.
 
-
+## 🇮🇩 Context
 Indonesia has more than 700 languages, and many of them are dying at an alarming rate. Language technologies like generative AI can play a massive role in language preservation. However, Indonesia has several contextual issues:
 
 - Many languages, including those with millions of speakers, have low-volume digital resources
@@ -45,18 +44,18 @@ Indonesia has more than 700 languages, and many of them are dying at an alarming rate. La
 
 Overcoming these challenges requires developers to work with what little data and money they have. Sasando-1 is a prototype demonstrating that scarce resources can potentially still be leveraged to develop generative models with cheap compute.
 
-
+## ✨ Specs
 - Comes with 7M and 25M parameters
 - Based on Phi-3 architecture
 - Embedding vocab size: 4096
 - Trained on ~257M tokens × 4 epochs
 
-
+## 🔭 Out-of-Scope Use
 This is a research preview base model. It is not instruction-tuned and has minimal safety curation. It is not intended for commercial or practical applications.
 
 You are also not allowed to use this model without having fun.
 
-
+## Acknowledgments
 
 - **Developed by:** Afrizal Hasbi Azizy
 - **License:** MIT
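
The Welcome section of the README says the training corpus was filtered down to a fixed number of unique words, but it does not say how. The following is a minimal sketch of one plausible approach, not the author's actual pipeline: it assumes the allowed word list is chosen by frequency and that a sentence is kept only if every word in it is on that list. The names `tokenize`, `build_allowed_vocab`, and `filter_sentences` are hypothetical, and the regex tokenizer is a simplification.

```python
import re
from collections import Counter

# Assumption: words are runs of ASCII letters; a real pipeline would
# likely use a proper Indonesian tokenizer.
WORD_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return WORD_RE.findall(text.lower())


def build_allowed_vocab(sentences, max_words):
    """Return the set of the `max_words` most frequent words."""
    counts = Counter(w for s in sentences for w in tokenize(s))
    return {word for word, _ in counts.most_common(max_words)}


def filter_sentences(sentences, allowed):
    """Keep only sentences whose words all belong to `allowed`."""
    kept = []
    for sentence in sentences:
        words = tokenize(sentence)
        if words and all(w in allowed for w in words):
            kept.append(sentence)
    return kept


if __name__ == "__main__":
    corpus = [
        "saya makan nasi",
        "nasi saya makan",
        "saya minum air",
        "kucing tidur",
    ]
    # Toy limit of 3 words; the README mentions a limit of 18000.
    allowed = build_allowed_vocab(corpus, max_words=3)
    print(filter_sentences(corpus, allowed))
```

With a toy limit of three words, only the sentences built entirely from the three most frequent words survive. Dropping whole sentences rather than masking rare words keeps the remaining text fluent, at the cost of discarding much of the raw corpus.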