
NanoT5 Small Malaysian Translation V2.1

Finetuned from https://huggingface.co/mesolitica/nanot5-small-malaysian-cased with a 2048-token context length on 9B tokens of translation data.

  • This model is able to translate local (colloquial) text into standard text.
  • This model is able to reverse translate from standard to local text, which is suitable for text augmentation.
  • This model is able to translate code.
  • This model natively handles code switching.
  • This model should preserve \n, \t, and \r as they are.
  • Better Science and Math context translation compared to V2.
  • Better Manglish translation compared to V2.
  • Better Cantonese translation compared to V2.
  • Better Tamil and Tanglish translation compared to V2.
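The translation use cases above can be sketched with Hugging Face transformers. This is a minimal, hedged example: the task-prefix format (`terjemah ke <lang>: ...`) is an assumption based on common mesolitica translation checkpoints; verify the exact prompt format before relying on it.

```python
# Sketch of using the checkpoint as a seq2seq translator via transformers.
# The prompt prefix below is an ASSUMPTION, not confirmed by this card.

MODEL_ID = "mesolitica/nanot5-small-malaysian-translation-v2.1"

def build_prompt(text: str, to_lang: str = "Melayu") -> str:
    # Assumed task-prefix convention; check the model card for the real one.
    return f"terjemah ke {to_lang}: {text}"

def translate(text: str, to_lang: str = "Melayu") -> str:
    # Heavy dependencies are imported lazily so the module loads without them.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text, to_lang), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Because the model also supports reverse translation (standard to local text), the same `translate` sketch works for augmentation by swapping the hypothetical target-language tag in the prefix.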

Wandb at https://wandb.ai/huseinzol05/nanot5-small-malaysian-cased-translation-v5-multipack-post; training is still in progress.

Model size: 89.5M parameters (Safetensors, F32 tensors)
