Quiet-STaR?

#9
by albatrossbirdie - opened

I'm wondering whether this uses Quiet-STaR to generate thinking tokens, or whether it's just a baked-in system prompt that forces it to use CoT reasoning?

CoT is baked into the weights via RLHF or similar directional fine-tuning, obviously, sis.