Infinite repeating characters

#2 opened by MrNewman

I'm using Q5_K_M with langchain's llamacpp. After about 25 words the output gets stuck in an infinite loop, repeating either a phrase, a word, or a character ("the next file the next file the next file", "set set set set set set set", "pythonnnnnnnnnnnnnnnnnnnnn").

It doesn't seem to be affected by context length, top_p, top_k, or temperature. Q8_0 behaves the same. Is anyone else having this issue?
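
For reference, my setup looks roughly like this (the model filename, prompt, and exact sampling values below are placeholders, not my real config):

```python
# Rough sketch of the setup that triggers the looping output.
# The import path depends on the langchain version; newer releases
# expose it as langchain_community.llms.LlamaCpp instead.
from langchain.llms import LlamaCpp

llm = LlamaCpp(
    model_path="./model.Q5_K_M.gguf",  # placeholder path to the quantized GGUF
    n_ctx=4096,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    max_tokens=256,
)

# After ~25 words the completion degenerates into a repeating phrase/character.
print(llm("Explain what this repository does."))
```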

Yep, same here with Q3 and Q4 ...

@MrNewman @Dbone @TheBloke I had this problem as well. It appears that this model ships with a poorly performing default value for RoPE scaling. What did the trick for me in koboldcpp was adding the argument --ropeconfig 1.0 1000000
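
@MrNewman since you're going through langchain's llamacpp, the equivalent should be passing rope_freq_base / rope_freq_scale down to llama-cpp-python. I haven't tested this path myself, so treat it as a sketch:

```python
# Untested sketch of the same fix via llama-cpp-python (the library behind
# langchain's LlamaCpp). koboldcpp's "--ropeconfig 1.0 1000000" means
# rope_freq_scale=1.0 and rope_freq_base=1000000.
from llama_cpp import Llama

llm = Llama(
    model_path="./model.Q5_K_M.gguf",  # placeholder path
    n_ctx=4096,
    rope_freq_base=1000000.0,  # override the badly chosen default base frequency
    rope_freq_scale=1.0,       # no linear context scaling
)

print(llm("Explain what this repository does.", max_tokens=256))
```

Recent versions of langchain's LlamaCpp wrapper should accept the same rope_freq_base / rope_freq_scale keyword arguments, so you can also set the two values there directly.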

@sofuego Could you please tell me how to find out which ropeconfig value to set? If this problem shows up on other models, I'd like to know how to choose the right ropeconfig parameter.

@sofuego Thanks mate, that also fixed the codellama-34b-instruct for me :D
Settings in LocalAI are:
rope_freq_base: 1000000
rope_freq_scale: 1.0
