RoPE scaling guidance

#4
by RonanMcGovern - opened

Congrats on the model.

Is there any guidance on using RoPE scaling to extend the context beyond 16k tokens?

How does the model perform at 32k tokens or more? Thanks.
