Model code for training from scratch

#6
by Chrisneverdie - opened

Hi there,

Thank you so much for the amazing work. I wonder if there is code and config available for us to train this model structure from scratch using our own dataset.

Thank you again!

Hugging Face TB Research org

Hi, we will release the code soon

Hi. Any updates?

Hugging Face TB Research org

We're working on this: we're waiting for some PRs to be merged and will release the code soon. For training we use nanotron (https://github.com/huggingface/nanotron). In the meantime, here is a gist of the config we use: https://gist.github.com/eliebak/4263659706519536b7eebfe6d9815c60

Ah okay! Thanks for the quick response!
