Link to code repository

#3
by ewre324 - opened

Hello, I was wondering if the authors would be open sourcing the code for training from scratch and the training dataset?

BEEspoke Data org

Hi! Unfortunately, we don’t have a repo for this, but the pretraining code was quite literally a slightly modified version of the language-modeling example scripts in transformers: https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/README.md

As for datasets, it’s the ones listed, plus others on our page. Those are the ones I’m able to share at the moment.
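For anyone wanting to try the same route, here is a minimal sketch of how a from-scratch run with those example scripts might be launched. It is not the authors' actual setup: the choice of `run_clm.py`, the helper name `build_scratch_pretrain_command`, and every value passed to it (`gpt2`, `wikitext`, the output path) are placeholder assumptions. The one relevant detail is that passing `--model_type` and `--tokenizer_name` without `--model_name_or_path` makes the script build a freshly initialized model from a config instead of loading pretrained weights.

```python
import shlex


def build_scratch_pretrain_command(
    model_type,
    tokenizer_name,
    dataset_name,
    output_dir,
    dataset_config_name=None,
    script="run_clm.py",
):
    """Assemble an argv list for a from-scratch run of a transformers example script.

    Hypothetical helper for illustration only. The flags are ones the
    language-modeling example scripts accept; the values are up to the caller.
    """
    command = [
        "python", script,
        # No --model_name_or_path: the script then initializes weights from a config.
        "--model_type", model_type,
        "--tokenizer_name", tokenizer_name,
        "--dataset_name", dataset_name,
    ]
    if dataset_config_name is not None:
        command += ["--dataset_config_name", dataset_config_name]
    command += ["--do_train", "--output_dir", output_dir]
    return command


# Placeholder values, not the settings used for this model.
command = build_scratch_pretrain_command(
    model_type="gpt2",
    tokenizer_name="gpt2",
    dataset_name="wikitext",
    dataset_config_name="wikitext-2-raw-v1",
    output_dir="/tmp/scratch-clm",
)
print(shlex.join(command))
```

The printed line can be pasted into a shell from the `examples/pytorch/language-modeling` directory of a transformers checkout. Masked-language or other objectives would use a different script from the same folder.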
