Repetitive Thinking

#2
by Zhuohan - opened

Hi, thank you so much for the model.
While running my test cases, the model repeats the thinking process until it hits the maximum token limit instead of stopping after the final response. Is this intended? If not, how should I set a reasonable max_new_tokens to save reasoning time while still making sure the model completes at least one `Thinking\n\n## Final Response\n\n` iteration?
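To clarify what I mean, here is a rough post-processing sketch of the behavior I'd like (the helper name and exact markers are my own guesses, not from the model card): cut the output at the start of any second `Thinking` block that appears after the first `## Final Response`.

```python
def truncate_at_final_response(text: str, marker: str = "## Final Response") -> str:
    """Keep everything up to the end of the first final-response section.

    If the model starts another "Thinking" block after the marker,
    drop it; otherwise return the text unchanged.
    (Marker strings are assumptions based on the observed output format.)
    """
    start = text.find(marker)
    if start == -1:
        return text  # marker never produced; nothing to truncate
    # Look for a second round of thinking after the final response.
    next_thinking = text.find("Thinking", start + len(marker))
    if next_thinking == -1:
        return text
    return text[:next_thinking].rstrip()


sample = "Thinking\n\nstep 1\n\n## Final Response\n\nAnswer: 42\n\nThinking\n\nstep 1 again"
print(truncate_at_final_response(sample))
```

But post-processing still wastes the tokens spent on the repeated loops, which is why I'm hoping there is a proper way to make generation stop on its own.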
Thank you in advance