Is this trained off the base or instruct model?

by Downtown-Case - opened Oct 23

Oct 23

The sidebar says this is based on the base model, but that field is sometimes incorrect, so I want to be sure.

I'm mostly asking because Qwen 32B instruct is trained to use YaRN for >32K context, but that's not the case with the base model.

EVA-UNIT-01 org Oct 23

This model was trained on top of the base model, so the sidebar info is correct here. You can also see this in the Axolotl config at the bottom of the card.

AuriAetherwiing changed discussion status to closed Oct 23

Oct 23

Oh I missed that, thanks!
