A Comparative Study on Reasoning Patterns of OpenAI's o1 Model • Paper • 2410.13639 • Published Oct 17 • 16
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch • Paper • 2410.18693 • Published Oct 24 • 40
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs • Paper • 2412.03205 • Published 11 days ago • 14
ProcessBench: Identifying Process Errors in Mathematical Reasoning • Paper • 2412.06559 • Published 5 days ago • 55