FastCuRL-1.5B-Preview

FastCuRL Overview

We release FastCuRL-1.5B-Preview, a slow-thinking reasoning model that achieves 43.1% accuracy on the AIME 2024 benchmark! We apply a novel curriculum-guided iterative lengthening reinforcement learning strategy to the distilled 1.5B model and observe continuous performance improvements as training steps increase. To support reproduction of our work and advance research progress, we open-source our code, model, and data.

Code: https://github.com/nick7nlp/FastCuRL
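
Below is a minimal inference sketch using Hugging Face Transformers. The prompt and sampling settings (temperature 0.6, top-p 0.95, 4,096 new tokens) are illustrative assumptions, not the authors' evaluation configuration.

```python
# Minimal sketch: load FastCuRL-1.5B-Preview and generate a reasoning trace.
# Sampling hyperparameters below are assumptions for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nickyang/FastCuRL-1.5B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "Find the remainder when 7^2024 is divided by 100. Please reason step by step."
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=4096, do_sample=True, temperature=0.6, top_p=0.95)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```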

Key Results

We report Pass@1 accuracy averaged over 16 samples for each problem.

| Model | AIME 2024 | MATH 500 | AMC 2023 | Minerva Math | OlympiadBench | Avg. |
|---|---|---|---|---|---|---|
| Qwen2.5-Math-7B-Instruct | 13.3 | 79.8 | 50.6 | 34.6 | 40.7 | 43.8 |
| rStar-Math-7B | 26.7 | 78.4 | 47.5 | - | 47.1 | - |
| Eurus-2-7B-PRIME | 26.7 | 79.2 | 57.8 | 38.6 | 42.1 | 48.9 |
| Qwen2.5-7B-SimpleRL | 26.7 | 82.4 | 62.5 | 39.7 | 43.3 | 50.9 |
| DeepSeek-R1-Distill-Qwen-1.5B | 28.8 | 82.8 | 62.9 | 26.5 | 43.3 | 48.9 |
| Still-1.5B | 32.5 | 84.4 | 66.7 | 29.0 | 45.4 | 51.6 |
| DeepScaleR-1.5B-Preview | 43.1 | 87.8 | 73.6 | 30.2 | 50.0 | 57.0 |
| FastCuRL-1.5B-Preview | 43.1 | 88.0 | 74.2 | 31.6 | 50.4 | 57.5 |
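
For reference, the sketch below shows one way to compute the averaged Pass@1 metric reported above (the per-problem fraction of correct samples, averaged over all problems). The data layout is a hypothetical placeholder, not our evaluation harness.

```python
# Sketch of Pass@1 averaged over k samples per problem.
# `results` is a hypothetical layout: one list of booleans per problem,
# each boolean marking whether that sampled completion was correct.
def average_pass_at_1(results):
    per_problem = [sum(samples) / len(samples) for samples in results]
    return sum(per_problem) / len(per_problem)

# Example: two problems with 16 samples each (12/16 and 8/16 correct) -> 0.625
print(average_pass_at_1([[True] * 12 + [False] * 4, [True] * 8 + [False] * 8]))
```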

Training Data

Following DeepScaleR, our training dataset consists of 40,315 unique problem-answer pairs compiled from the following sources (a compilation sketch follows the list):

  • AIME problems (1984-2023)
  • AMC problems (before 2023)
  • Omni-MATH dataset
  • Still dataset
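
For illustration, here is a hedged sketch of compiling and de-duplicating such problem-answer pairs. The file names, JSONL format, and field names are assumptions, not the released data layout.

```python
# Hypothetical compilation sketch: merge several JSONL sources and keep only
# unique problem statements. File names and fields are assumptions.
import json

def load_pairs(path):
    with open(path, encoding="utf-8") as f:
        return [(row["problem"], row["answer"]) for row in map(json.loads, f)]

sources = ["aime_1984_2023.jsonl", "amc_pre_2023.jsonl", "omni_math.jsonl", "still.jsonl"]
seen, dataset = set(), []
for path in sources:
    for problem, answer in load_pairs(path):
        key = problem.strip()  # de-duplicate on the problem statement
        if key not in seen:
            seen.add(key)
            dataset.append({"problem": problem, "answer": answer})
print(f"{len(dataset)} unique problem-answer pairs")
```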

Acknowledgements
