Nickyang/FastCuRL-1.5B-Preview

Nickyang committed 2 days ago · Commit d18c25e (verified) · 1 parent: bb2ce63

Update README.md

Files changed (1)

README.md +1 -1

@@ -14,7 +14,7 @@ library_name: transformers
 ## FastCuRL Overview
-We release **FastCuRL-1.5B-Preview**, a slow-thinking reasoning model that achieves 43.1% accuracy on the AIME 2024 benchmark! We adapt a novel curriculum-guided iterative lengthening reinforcement learning to the distilled 1.5B model and observe continuous performance improvement as training steps increase. To better reproduce our work and advance research progress, we open-source our code, model, and data.
+We release **FastCuRL-1.5B-Preview**, a slow-thinking reasoning model that **outperforms** the previous SoTA *DeepScaleR-1.5B-Preview* with **50% training steps**! We adapt a novel curriculum-guided iterative lengthening reinforcement learning to the *DeepSeek-R1-Distill-Qwen-1.5B* and observe continuous performance improvement as training steps increase. To better reproduce our work and advance research progress, we open-source our code, model, and data.
 Code: https://github.com/nick7nlp/FastCuRL