Model does not stop generating new tokens.
I have followed the guide https://huggingface.co/blog/mlabonne/orpo-llama-3 (and the Colab notebook) to fine-tune Mistral-7B-v0.3 on a 2.5k subsample. However, the model does not stop generating new tokens. I tried adding 'eos_token_id=tokenizer.eos_token_id' to the generation call to signal the end of a sequence, but that didn't work either (see the sketch below). Any clue?
Here is the fine-tuned model: https://huggingface.co/MuntasirHossain/Orpo-Mistral-7B-v0.3.
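For reference, a minimal sketch of my generation setup (the prompt and generation parameters here are illustrative, not the exact demo code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MuntasirHossain/Orpo-Mistral-7B-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is ORPO fine-tuning?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Even with eos_token_id passed explicitly, generation runs until max_new_tokens.
outputs = model.generate(
    inputs,
    max_new_tokens=256,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```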
I checked your tokenizer config and everything is correct. I think you might want to train the model on more tokens so it correctly learns to output the EOS token (2.5k is quite small).
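In case it helps, this is roughly the kind of sanity check I mean (purely illustrative): confirm that the chat-templated training examples actually end with the EOS token, since otherwise the model never sees it during ORPO training and won't learn to emit it.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MuntasirHossain/Orpo-Mistral-7B-v0.3")
print(tokenizer.eos_token, tokenizer.eos_token_id)

# A formatted conversation should terminate with the EOS token.
example = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
]
text = tokenizer.apply_chat_template(example, tokenize=False)
print(text.endswith(tokenizer.eos_token))
```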
Thank you! I thought about that. I tested the model you fine-tuned with only 1k samples, and that worked fine with no issues stopping generation. So I thought 2.5k would be good enough for the demo, but then I was a bit surprised by this issue!
I think I had the same issue with the version trained on 1k samples. The current version has been trained on the full dataset (but for just 1 epoch, I believe).
Oh I see! Your model card says it was fine-tuned on 1k samples (you might want to update it!), so I didn't want to start with a large sample size for a demo.
Btw, thanks again for the excellent guide on ORPO.
You're right, just updated it. Thanks and good luck!