
NanoT5 Small Malaysian Translation V2.1

Finetuned from https://huggingface.co/mesolitica/nanot5-small-malaysian-cased with a 2048-token context length on 9B tokens of translation data.

  • This model is able to translate local (colloquial) text into standard text.
  • This model is able to reverse translate from standard to local text, which is suitable for text augmentation.
  • This model is able to translate code.
  • This model natively handles code switching.
  • This model should preserve \n, \t, and \r as they are.
  • Better Science and Math context translation compared to V2.
  • Better Manglish translation compared to V2.
  • Better Cantonese translation compared to V2.
  • Better Tamil and Tanglish translation compared to V2.
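The translation use cases above can be sketched with Hugging Face transformers. This is a minimal, hedged example: the task-prefix format (`terjemah ke <lang>: ...`) is an assumption based on common mesolitica translation checkpoints; verify the exact prompt format before relying on it.

```python
# Sketch of using the checkpoint as a seq2seq translator via transformers.
# The prompt prefix below is an ASSUMPTION, not confirmed by this card.

MODEL_ID = "mesolitica/nanot5-small-malaysian-translation-v2.1"

def build_prompt(text: str, to_lang: str = "Melayu") -> str:
    # Assumed task-prefix convention; check the model card for the real one.
    return f"terjemah ke {to_lang}: {text}"

def translate(text: str, to_lang: str = "Melayu") -> str:
    # Heavy dependencies are imported lazily so the module loads without them.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text, to_lang), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Because the model also supports reverse translation (standard to local text), the same `translate` sketch works for augmentation by swapping the hypothetical target-language tag in the prefix.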

Wandb at https://wandb.ai/huseinzol05/nanot5-small-malaysian-cased-translation-v5-multipack-post; training is still in progress.

Model size: 89.5M parameters (Safetensors, F32 tensors)
