pgarbacki's Collections
reasoning
updated
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 78
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published • 54
ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights
Paper • 2406.14596 • Published • 5
A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More
Paper • 2407.16216 • Published
Thinking LLMs: General Instruction Following with Thought Generation
Paper • 2410.10630 • Published • 18
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 255
TextGrad: Automatic "Differentiation" via Text
Paper • 2406.07496 • Published • 30
Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving
Paper • 2002.03629 • Published
LIMO: Less is More for Reasoning
Paper • 2502.03387 • Published • 49