Context size in the README does not seem to be correct.
#2 opened by mahiatlinux
Hi there!
Thanks for the report. In this case, the README/model card is accurate.
Please also note this comment: https://huggingface.co/Qwen/Qwen2.5-32B/discussions/1#66f11ee85a65f26be709470b
For base models, the context length was evaluated with perplexity (ppl). Even though the model was trained with a 32k context, the ppl did not degrade at 132k. However, this does not mean that the model can generate sequences of that length.
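For anyone curious how such an evaluation works in practice, here is a minimal sketch (not the Qwen team's actual evaluation script) of probing perplexity at a chosen context length with Hugging Face transformers. The model name and `max_length` value are placeholders for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-32B"   # placeholder; any causal LM works
max_length = 32768                # context length being probed

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

def perplexity(text: str) -> float:
    # Tokenize and truncate to the context length under evaluation.
    ids = tokenizer(text, return_tensors="pt").input_ids[:, :max_length].to(model.device)
    with torch.no_grad():
        # Passing labels=ids returns the average next-token cross-entropy;
        # exp(loss) is the perplexity over this context window.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()
```

A flat perplexity curve as `max_length` grows suggests the model still reads long inputs well, which is a separate question from whether it can generate outputs of that length.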
Thank you very much for the clarification!
mahiatlinux changed discussion status to closed