Zikang Shan

zkshan2002

AI & ML interests

Reinforcement Learning

Recent Activity

published a dataset 1 day ago

zkshan2002/simpleRL-reason-math_level3to5_data_processed_with_qwen_prompt

updated a dataset 1 day ago

zkshan2002/simpleRL-reason-math_level3to5_data_processed_with_qwen_prompt

published a model 6 days ago

RTO-RL/Llama3-8B-TDPO

Models (1)

zkshan2002/RewardModel-uf-llama3.2-1B-OpenRLHF

Updated Oct 24, 2024 • 3

Datasets (2)

zkshan2002/simpleRL-reason-math_level3to5_data_processed_with_qwen_prompt

Viewer • Updated 1 day ago • 8.52k

zkshan2002/hh-rlhf_preprocessed

Viewer • Updated 24 days ago • 46.1k • 38