
zyyang

zy0yang

AI & ML interests

SFT & RLHF

Collections 5

Alignment-DPO-line

sDPO: Don't Use Your Data All at Once
Paper • 2403.19270 • Published Mar 28, 2024 • 40

Advancing LLM Reasoning Generalists with Preference Trees
Paper • 2404.02078 • Published Apr 2, 2024 • 44

Learn Your Reference Model for Real Good Alignment
Paper • 2404.09656 • Published Apr 15, 2024 • 82

mDPO: Conditional Preference Optimization for Multimodal Large Language Models
Paper • 2406.11839 • Published Jun 17, 2024 • 37

Multi-Head Mixture-of-Experts

Paper • 2404.15045 • Published Apr 23, 2024 • 59

models

None public yet

datasets

None public yet
