mesolitica/nanot5-small-malaysian-translation-v2


huseinzol05 committed on Oct 20, 2024

Commit 9886323 · verified · 1 Parent(s): 1612feb

Update README.md

Files changed (1)

README.md +4 -3

@@ -17,10 +17,11 @@ Wandb at https://wandb.ai/huseinzol05/nanot5-small-malaysian-cased-translation-v
 ## how we trained it?
-We done 2 phases,
-1. First phase, trained on 6B tokens noisy translation dataset.
-2. Second phase, trained on 1B tokens higher quality translation dataset.
+We done 3 phases,
+1. First phase, trained on 5% of the 6B tokens noisy translation dataset that include all prefixes on padding based training to improve attention bias.
+1. Second phase, trained on 6B tokens noisy translation dataset on packing based and this required to freeze attention bias to speed up the training.
+2. Third phase, trained on 1B tokens higher quality translation dataset on packing based and this required to freeze attention bias to speed up the training.
 ## Supported prefix
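
Note on the hunk above: the new README text contrasts "padding based" and "packing based" training. The sketch below is only an illustration of that general distinction, not the author's training code. The helper names (`pad_batch`, `pack_batch`, `pad_fraction`), the pad id `0`, and the toy token ids are all hypothetical. It shows why packing tends to waste fewer positions on padding; a real packed setup would also need attention masking so that examples sharing a row do not attend to each other, which is omitted here.

```python
from typing import List

PAD = 0  # hypothetical pad token id, chosen only for this illustration


def pad_batch(seqs: List[List[int]], max_len: int) -> List[List[int]]:
    """Padding-based: one example per row, filled with PAD up to max_len."""
    rows = []
    for s in seqs:
        s = s[:max_len]  # truncate overly long examples
        rows.append(s + [PAD] * (max_len - len(s)))
    return rows


def pack_batch(seqs: List[List[int]], max_len: int) -> List[List[int]]:
    """Packing-based: greedily concatenate examples into rows of max_len tokens."""
    rows, current = [], []
    for s in seqs:
        s = s[:max_len]
        if len(current) + len(s) > max_len:
            # Current row is full enough: pad the remainder and start a new row.
            rows.append(current + [PAD] * (max_len - len(current)))
            current = []
        current = current + s
    if current:
        rows.append(current + [PAD] * (max_len - len(current)))
    return rows


def pad_fraction(rows: List[List[int]]) -> float:
    """Share of positions in the batch that hold PAD rather than real tokens."""
    total = sum(len(r) for r in rows)
    return sum(t == PAD for r in rows for t in r) / total


seqs = [[5, 6, 7], [8, 9], [10], [11, 12, 13, 14]]  # toy token ids
padded = pad_batch(seqs, max_len=8)
packed = pack_batch(seqs, max_len=8)
print(len(padded), pad_fraction(padded))
print(len(packed), pad_fraction(packed))
```

On this toy input the packed batch uses fewer rows and a smaller share of pad positions than the padded one, which is the usual motivation for packing on large corpora.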