Quiet-STaR?
#9, opened by albatrossbirdie
I'm wondering whether this uses Quiet-STaR to generate thinking tokens, or whether it's just a baked-in system prompt that forces it to use CoT reasoning?
CoT is baked into the weights via RLHF or similar directional fine-tuning, obviously, sis.