afrizalha committed f280b45 (1 parent: 0607ca5)

Update README.md

Files changed (1): README.md (+5 -6)
README.md CHANGED
@@ -33,11 +33,10 @@ inference: true
  <p><strong><a href="https://huggingface.co/spaces/afrizalha/Sasando-1" style="color: blue; font-family: Tahoma;">❕Go straight to the gradio demo❕</a></strong></p>
  <p><em style="color: black; font-weight: bold;">This repo contains the 25M version.</em></p>
  </center>
-
- ### 🎻 Welcome!
+ ## 🎻 Welcome!
  Sasando-1 is a tiny, highly experimental Indonesian text generator built on the Phi-3 architecture. It comes in two microscopic sizes: 7M and 25M parameters. It is trained on a tightly-controlled Indo4B dataset filtered to contain only 18,000 unique words, a method inspired by Microsoft's TinyStories paper, which demonstrates that a tiny language model can produce fluent text when trained on a tightly-controlled dataset.
 
- ### 🇮🇩 Context
+ ## 🇮🇩 Context
  Indonesia has 700+ languages, and many of them are dying at an alarming rate. Language technologies like generative AI can play a massive role in language preservation. However, Indonesia faces several contextual issues:
 
  - Many languages, including those with millions of speakers, have low-volume digital resources
@@ -45,18 +44,18 @@ Indonesia has +700 languages, and many of them are dying at an alarming rate. La
 
  Overcoming these challenges requires developers to work with what little data and money they have. Sasando-1 is a prototypical demonstration that thinly-available resources can potentially still be leveraged to develop generative models with cheap compute.
 
- ### ✨ Specs
+ ## ✨ Specs
  - Comes in 7M and 25M parameter versions
  - Based on the Phi-3 architecture
  - Embedding vocabulary of 4,096
  - Trained on ~257M tokens × 4 epochs
 
- ### 🔭 Out-of-Scope Use
+ ## 🔭 Out-of-Scope Use
  This is a research-preview base model. It is not instruction-tuned and has minimal safety curation. It is not intended for commercial or practical applications.
 
  You are also not allowed to use this model without having fun.
 
- ### Acknowledgments
+ ## Acknowledgments
 
  - **Developed by:** Afrizal Hasbi Azizy
  - **License:** MIT
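For readers who want to try the checkpoint described in this README, below is a minimal sketch of loading it with the Hugging Face `transformers` library and sampling a short Indonesian continuation. The repo ID `afrizalha/Sasando-1-25M` and the prompt are assumptions for illustration (the diff only links the demo Space), and the generation settings are arbitrary.

```python
# Minimal sketch: load the (assumed) 25M checkpoint and sample a short
# Indonesian continuation. The repo ID below is an assumption based on the
# Space name in the README; adjust it to the actual model repo.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "afrizalha/Sasando-1-25M"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)  # Phi-3-style causal LM per the Specs section

prompt = "Pada suatu hari,"  # "One day," -- any short Indonesian prompt works
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,   # sampling suits a tiny base model better than greedy decoding
    temperature=0.7,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is a raw base model with a 4,096-entry vocabulary and no instruction tuning, expect short, simple continuations rather than instruction-following behavior.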