Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge Paper • 2502.16457 • Published 17 days ago • 11
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published 23 days ago • 43
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published Jan 13 • 92
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models Paper • 2404.02575 • Published Apr 3, 2024 • 50
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 73
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 97
Enable Language Models to Implicitly Learn Self-Improvement From Data Paper • 2310.00898 • Published Oct 2, 2023 • 23
Large Language Models Cannot Self-Correct Reasoning Yet Paper • 2310.01798 • Published Oct 3, 2023 • 35
Teach LLMs to Personalize -- An Approach inspired by Writing Education Paper • 2308.07968 • Published Aug 15, 2023 • 26
PolyLM: An Open Source Polyglot Large Language Model Paper • 2307.06018 • Published Jul 12, 2023 • 26