Zikang Shan
zkshan2002
AI & ML interests
Reinforcement Learning
Recent Activity
published a model 5 days ago: zkshan2002/instruct-rm1B
published a model 5 days ago: zkshan2002/instruct-dpo0.01
updated a model 5 days ago: zkshan2002/instruct-rm1B
Organizations
None yet
Models (8), sorted by most recently updated
zkshan2002/instruct-rm1B • Updated 5 days ago • 34
zkshan2002/instruct-dpo0.01 • Updated 5 days ago • 18
zkshan2002/ppo-0.44 • Updated Dec 1, 2024 • 2
zkshan2002/r1B-sft_tokenizer • Updated Nov 18, 2024 • 404
zkshan2002/RewardModel-uf-llama3.2-1B-OpenRLHF • Updated Oct 24, 2024 • 3
zkshan2002/DPO-uf-llama3-8B-OpenRLHF • Updated Oct 14, 2024 • 222
zkshan2002/PPO-uf-llama3-8B-OpenRLHF • Updated Oct 11, 2024 • 6
zkshan2002/RewardModel-uf-llama3-8B-OpenRLHF • Updated Oct 11, 2024 • 431
Datasets
None public yet