Do you think it can be superhot?

#1 opened by FenixInDarkSolo

I am just a user, but I am also curious about Llama-2's structure.
Do you think it can be trained to be a superhot model?

Hi, as far as I know, you can't merge kaiokendev's SuperHOT LoRA with anything except the original LLaMA it was trained on, which rules out everything else, including LLaMA 2.

If you want extended context, you can try koboldcpp 1.36 with the following params: --ropeconfig 0.5 10000 --contextsize 8192 (I can't test it right now, though; you might have to swap 0.5 for 0.25). But remember that LLaMA 2 already has a native context window of 4096 tokens, not 2048 like the original LLaMA, so you can use --ropeconfig 1 10000 --contextsize 4096 to utilize it with koboldcpp 1.36.
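In case it helps, here is a rough sketch of what the two launch commands would look like, assuming the usual koboldcpp.py launcher and a hypothetical GGML model filename (swap in whatever model file you actually have):

```
# Extended context (8192 tokens) via RoPE scaling; as noted above,
# the 0.5 scale factor may need to be 0.25 instead.
python koboldcpp.py llama-2-13b.ggmlv3.q4_0.bin --ropeconfig 0.5 10000 --contextsize 8192

# Native LLaMA 2 context (4096 tokens), no scaling applied.
python koboldcpp.py llama-2-13b.ggmlv3.q4_0.bin --ropeconfig 1 10000 --contextsize 4096
```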

If you want SuperHOT's spice, we'll have to wait for kaiokendev to train another LoRA on top of LLaMA 2.
