Intermediate checkpoints for research purposes

#11
by maveriq - opened

Hi!

I was wondering whether you could share intermediate checkpoints, so that they can be compared with models of different architectures trained on fewer tokens of the same dataset. Checkpoints at 10B-50B tokens would be especially useful for those with limited compute resources.

Thanks!
