Assistance with RoPE

#5
by UnspokenProtocol - opened

Hello, I apologize if this is a silly question, but what are the correct values to use for RoPE with this model? Every time I enable RoPE, even if I leave the scale at 1.0 and the context at 8k, it spits out gibberish. I saw in the description that up to 32K is acceptable. My goal is to increase the context size from 8K to 16K.

I have tried setting the scale to both 0.5 and 2.0, since doubling 8K gives 16K.
I have used NotebookLM to understand more about RoPE from the 3 PDFs on RoPE here (https://huggingface.co/collections/stereoplegic/rope-6542b286bf8a6039fbcd4ded).
I have followed your configuration guide (which works perfectly when RoPE isn't in use).
I have asked Gemini, ChatGPT, MS Copilot, and Meta AI (which all gave different answers, but I tried them all anyway).
I tried googling for a math formula or a calculator to find the correct values, but I was unable to find one.
I even tried scaling factors ranging from 1.0 (no interpolation) up to s × 1.25 as the extension ratio.

I am new to using LLMs (just over a week), and normally I'd just use one that was trained on a 16K or 32K context, but none compare to this one.

I can't think of any additional interventions other than to ask here. Even pushing me in the right direction would be deeply appreciated.

RoPE scale is usually set as a "fraction", i.e. 0.5 gives 2× the trained context length (0.25 would give 4×).
Depending on your AI app, you may also need to change the context length setting to match.
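The arithmetic above can be sketched in a few lines (a rough illustration of linear RoPE scaling as described, not any particular app's implementation; the function name is made up):

```python
def rope_linear_scale(trained_ctx: int, target_ctx: int) -> float:
    """Linear RoPE scale expressed as a fraction: trained / target.

    0.5 stretches an 8K-trained model to 16K; 0.25 stretches it to 32K.
    """
    return trained_ctx / target_ctx

# 8K-trained model, 16K target context
print(rope_linear_scale(8192, 16384))  # 0.5
```

So for the 8K-to-16K goal in the question, the scale would be 0.5, with the app's context length set to 16384 to match.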

Here is info on RoPE settings for various apps (scroll to the very bottom of the page):
https://huggingface.co/DavidAU/TieFighter-Holodeck-Holomax-Mythomax-F1-V1-COMPOS-20B-gguf

NOTE:
Do NOT use "context shift" and/or "Flash Attention" with RoPE. (You can actually use Flash Attention in lieu of RoPE.)

Also, when you use RoPE you need to compensate for it: increase the instructions in your prompt(s), and you may need to turn the temp up a bit.

That's actually SUPER helpful. I never would have found this on my own. Thank you so much!

For something like KoboldCPP, where Automatic RoPE Scaling kicks in on its own even if you disable RoPE, I assume you'd recommend that context shift and flash attention be turned off?
Console message: Automatic RoPE Scaling: Using (scale:1.000, base:1776946.1).
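For reference, the kind of arithmetic behind an auto-raised base looks like the common "NTK-aware" rule, which keeps the position indices unchanged and raises the frequency base instead: base' = base · s^(d/(d−2)), where d is the head dimension. This is a hedged sketch of that general rule only; KoboldCPP's actual heuristic (and the exact base it printed) may be computed differently:

```python
def ntk_rope_base(base: float, scale: float, head_dim: int) -> float:
    """NTK-aware RoPE extension: instead of shrinking positions by a
    fractional scale, raise the rotary frequency base so low frequencies
    stretch while high frequencies stay mostly intact."""
    return base * scale ** (head_dim / (head_dim - 2))

# Doubling the context of a base-10000 model with 128-dim heads
print(round(ntk_rope_base(10000.0, 2.0, 128)))  # 20221
```

This is why a log line can show `scale:1.000` together with a much larger base: the extension is being done through the base rather than through a fractional position scale.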

Somehow, run this way, there's no gibberish. All I can do is hope it keeps working once the context exceeds 8K.

Thank you very much again. I was genuinely trying for days. Before, when I disabled RoPE it would just give a context overflow error, so I have no idea why it's running auto now.

UnspokenProtocol changed discussion status to closed