Is this trained off the base or instruct model?

#1
by Downtown-Case - opened

The sidebar says it's based on the base model, but sometimes that's not correct, so I want to be sure.

I'm mostly asking because Qwen 32B instruct is trained to use YaRN for >32K context, but that's not the case with the base model.
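For context, here is a rough sketch of what enabling YaRN looks like, assuming the `rope_scaling` values suggested in the Qwen2.5 instruct model cards (factor 4.0 over a 32768-token original window). The `with_yarn` helper and the minimal example config are hypothetical and only for illustration; whether these settings make sense for a finetune of the base model is exactly what I'm unsure about.

```python
import json

# Assumed values, taken from the long-context note in the Qwen2.5 instruct
# model cards; they are not confirmed for this finetune.
YARN_ROPE_SCALING = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}


def with_yarn(config: dict) -> dict:
    """Return a copy of a config.json-style dict with YaRN rope scaling added."""
    patched = dict(config)  # shallow copy so the input dict is left untouched
    patched["rope_scaling"] = dict(YARN_ROPE_SCALING)
    return patched


# Hypothetical minimal config; a real config.json has many more keys.
example = {"max_position_embeddings": 32768}
print(json.dumps(with_yarn(example), indent=2))
```

In practice you would load the model's actual `config.json`, apply the same change, and write it back before loading the model.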

EVA-UNIT-01 org
edited Oct 23

This model was trained on top of the base model. The sidebar info is correct here. You can also see this in the Axolotl config at the bottom of the card.

AuriAetherwiing changed discussion status to closed

Oh I missed that, thanks!
