Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets Paper • 2410.01779 • Published Oct 2, 2024 • 2
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published Dec 9, 2024 • 64
Cautious Optimizers: Improving Training with One Line of Code Paper • 2411.16085 • Published Nov 25, 2024 • 15