Better coding dataset
#1 opened by rombodawg
If you need a bigger dataset than CodeAlpaca that's formatted in a very similar way, I have made one and you are free to use it.
Link below:
https://huggingface.co/datasets/rombodawg/MegaCodeTraining112k/tree/main
You rock, @rombodawg! Will try it out.
Let me know if you train a model with my dataset, please! I've been waiting to try that type of model, I just don't have the resources to train one myself.
@bwang0911 @samsja If you guys are interested, I have made a version 3 of my MegaCode dataset, and this one is the most promising yet. Feel free to use it to train your future models:
https://huggingface.co/datasets/rombodawg/LosslessMegaCodeTrainingV3_2.2m_Evol