Hao Peng

Wesleythu · h-peng17

AI & ML interests

None yet

Recent Activity

upvoted a paper about 4 hours ago

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

commented on a paper about 4 hours ago

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

liked a dataset about 18 hours ago

THU-KEG/IFBench

Organizations

Wesleythu's activity

commented on a paper about 4 hours ago

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

Paper • 2502.19328 • Published about 14 hours ago • 9

commented on a paper 4 months ago

Constraint Back-translation Improves Complex Instruction Following of Large Language Models

Paper • 2410.24175 • Published Oct 31, 2024 • 18