Model code for training from scratch
#6, opened by Chrisneverdie
Hi there,
Thank you so much for the amazing work. I wonder whether the code and config are available so that we can train this model architecture from scratch on our own dataset.
Thank you again!
Hi, we will release the code soon
Hi. Any updates?
Working on this. We are waiting for some PRs to be merged and will release it soon. For training we use nanotron (https://github.com/huggingface/nanotron), and in the meantime here is a gist of the config we use: https://gist.github.com/eliebak/4263659706519536b7eebfe6d9815c60
Ah okay! Thanks for the quick response!