How many epochs did you train on code_bagel?

#2
by rombodawg - opened

Just curious did you train over the whole dataset? and how many epochs? And is this a full finetune or Lora?

Owner

Whole dataset. Took a few days to train it.
With UNA, only 1 epoch is needed. Results with more epochs are not better.

Gotcha, I'm not super familiar with UNA. I'm currently training with QLoRA, and I've found that 5-6 epochs are needed to get the best results. Most people were doing only 3, and after some trial and error, I learned that the low number of epochs was causing the models to be much lower quality.

Oh, one more thing: since my dataset is based on coding, can we get a HumanEval benchmark? BigCode's eval repository is the easiest way to set it up.
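For reference, a rough sketch of how running HumanEval with BigCode's harness typically looks (repo URL and flags are assumed from the public bigcode-evaluation-harness project; `<model-id>` is a placeholder for the model under test):

```shell
# Sketch: set up BigCode's evaluation harness and run HumanEval.
# Flag names are assumptions based on the project's documented interface.
git clone https://github.com/bigcode-project/bigcode-evaluation-harness.git
cd bigcode-evaluation-harness
pip install -r requirements.txt

# --allow_code_execution is needed because HumanEval executes generated code.
accelerate launch main.py \
  --model <model-id> \
  --tasks humaneval \
  --n_samples 1 \
  --allow_code_execution
```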
