huseinzol05's picture
Update README.md
adad8d8 verified
|
raw
history blame
699 Bytes
metadata
language:
  - ms
  - en
  - zh
  - ta

Malaysian SmolLM2-360M Instruct

Continue finetuning https://huggingface.co/HuggingFaceTB/SmolLM2-360M on highly curated 1.5B tokens Malaysian instruction dataset.

Improvement

  1. Support respond in Manglish, Mandarin, Tamil, Jawi, Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu.
  2. Able to code in Manglish, Mandarin, Tamil, Jawi, Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu.
  3. Multi-turn Malaysian context such as related to Malaysian Legislation, politics, religions and languages.
  4. Malaysian role-playing.
  5. Standard RAG.

Still on training.