Context size in the README does not seem to be correct.

#2 · opened by mahiatlinux

Hi there. Very nice line of models!

The README mentions that this model has a 128K context, but the config file appears to show only 32K. Please correct me if I am wrong on this.

README: [screenshot showing the 128K context length claim]

config.json: [screenshot showing the 32K value]
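For reference, a quick way to read the configured value without downloading the weights, as a minimal sketch. The repo id here is taken from the discussion linked below and is an assumption; substitute the model in question:

```python
# Sketch: read the configured context window from config.json via
# transformers. The repo id is an assumption, not necessarily this model.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-32B")
print(config.max_position_embeddings)  # prints 32768 for a 32K config
```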

Qwen org · edited Oct 10

Hi there!

Thanks for the report. In this case, the README/model card is accurate.

Please also note this comment: https://huggingface.co/Qwen/Qwen2.5-32B/discussions/1#66f11ee85a65f26be709470b

For base models, the context length was evaluated with perplexity (PPL). Even though the model was trained with a 32K context, perplexity did not degrade at 128K. However, this does not mean that the model can generate sequences of that length.
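For illustration, here is a minimal sketch of what a perplexity-based length check could look like. This is not Qwen's actual evaluation code; the repo id and the input file are hypothetical, and running 128K-token inputs needs substantial GPU memory:

```python
# Sketch: measure perplexity at increasing context lengths.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-32B"  # assumption: substitute the model under discussion
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
model.eval()

# Assumes a document long enough to cover the largest window.
ids = tokenizer(open("long_text.txt").read(), return_tensors="pt").input_ids
ids = ids.to(model.device)

for ctx_len in (32_768, 65_536, 131_072):
    chunk = ids[:, :ctx_len]
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean
        # next-token cross-entropy over the window.
        loss = model(chunk, labels=chunk).loss
    print(f"context {ctx_len:>7}: ppl = {torch.exp(loss).item():.2f}")
```

Low, flat perplexity at 128K shows the model still predicts well over a long window, which is a weaker claim than being able to generate coherent output of that length.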

Thank you very much for the clarification!

mahiatlinux changed discussion status to closed
