The LLM output is incomplete

#11
by lijianqiang - opened

When using codeqwen1.5-7b-chat, inference is often cut off mid-response, and I have to say "continue" to get the model to output the remaining results. Increasing the max-token-len has not helped.
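For context, a minimal sketch of how the generation budget is usually controlled when serving the model through Hugging Face transformers, assuming that is the setup here; `max_new_tokens` and the example prompt are illustrative assumptions, not settings confirmed by the poster:

```python
# Sketch: loading CodeQwen1.5-7B-Chat with transformers and raising the
# generation budget. In the transformers generate() API, output length is
# capped by max_new_tokens (new tokens produced), not the context length,
# so raising a context-length setting alone will not extend a response.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/CodeQwen1.5-7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Write a quicksort in Python."}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# If responses are being truncated at a fixed length, this is the cap to raise.
output_ids = model.generate(**inputs, max_new_tokens=2048)
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(response)
```

If the model is served through an OpenAI-compatible endpoint instead, the corresponding request parameter is typically `max_tokens`.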

Qwen org

Any prompts for reproduction?
