xinlai/DeepSeekMath-RL-Step-DPO
Text Generation
Transformers
Safetensors
llama
conversational
text-generation-inference
Inference Endpoints
arxiv: 2406.18629
License: apache-2.0
main
DeepSeekMath-RL-Step-DPO
Commit History
Update README.md
f8c9733
verified
xinlai
committed 11 days ago
Update README.md
0134ded
verified
xinlai
committed 11 days ago
upload model
6d4f1cf
xinlai
committed 13 days ago
initial commit
9dae8ba
verified
xinlai
committed 13 days ago