arxiv:2404.14367
Anikait Singh
Asap7772
AI & ML interests
Reinforcement Learning, Robotics
Organizations
Papers
2
Models
None public yet
Datasets
76
Asap7772/value_prm800k_disc90 • 760k
Asap7772/value_prm800k_disc80 • 760k
Asap7772/prm_1to3_math • 115k
Asap7772/prm_1to5_math • 272k • 293
Asap7772/prm_prm800k_1to3 • 248k
Asap7772/prm_prm800k • 764k • 127
Asap7772/sft_math_1to3_pruned • 706k
Asap7772/sft_math_1to3 • 348k
Asap7772/sft_prm800k_1to3 • 168k
Asap7772/sft_math_pruned • 794k