Hallucination

#2
by cnmoro - opened

On prompts with context larger than 8k tokens, it simply starts outputting nonsense text.

I selected the raw text from a few books and tried to generate some summaries. I have tried all sorts of prompt techniques, but it does not answer correctly; it tries to continue the text of the book instead of answering my question, whatever it is. If the context is under 8k tokens, it answers correctly.

I am using the updated tokenizer

Owner

I just found out: RoPE scaling has no effect here. The RoPE frequency base itself has to be set to 160000 for the model to accept this many tokens. The models will be updated.

Source: https://www.reddit.com/r/LocalLLaMA/s/g9SNNrfbpK
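In the meantime, here is a rough sketch of how to override the base at inference time with llama.cpp, for anyone who wants to test before the re-upload (the file name model.gguf, prompt.txt and the -c/-n values are just placeholders; --rope-freq-base overrides whatever base is stored in the GGUF metadata):

llama-cli -m model.gguf -c 32768 -n 1024 --rope-freq-base 160000 -f prompt.txt

This is only a workaround for testing; the proper fix is baking the correct frequency base into the model files.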

Owner

Please recheck with the correct RoPE technique applied.

The issue persists :(

The model weights will have to be updated. Stay tuned while I figure out how to get LongLM to cooperate.

Here is a reference: https://www.reddit.com/r/LocalLLaMA/s/NTHnMabIDS

I don't have enough memory for this, but I am going to try anyway.

Maybe this will work, but I'm not sure:

llama-cli -m <gguf> -c 32768 -n 1024 --temp 0 -gan 8 -gaw 4096
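If I'm reading the llama.cpp options right, -gan and -gaw are the short forms of --grp-attn-n and --grp-attn-w (the Self-Extend group-attention factor and window width), so the same thing spelled out would be:

llama-cli -m <gguf> -c 32768 -n 1024 --temp 0 --grp-attn-n 8 --grp-attn-w 4096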

@cnmoro I changed some more things about the sliding window, apart from applying the RoPE techniques.

If you still get hallucinations, I'm sorry, but I can't fix it due to the lack of resources at hand.

This model will not be advertised as having a high context size, just to avoid any confusion that something is off.
