wzhouad (Wenxuan Zhou)

Collections 1

Models and datasets from the paper "WPO: Enhancing RLHF with Weighted Preference Optimization".

Papers 5

models 8

datasets 4

wzhouad/llama3-ultrafeedback-hybrid

Viewer • Updated Aug 22, 2024 • 64.5k • 86 • 2

wzhouad/llama3-ultrafeedback-hybrid-v2

Viewer • Updated Aug 22, 2024 • 64.5k • 83 • 5

wzhouad/zephyr-ultrafeedback-hybrid

Viewer • Updated Aug 21, 2024 • 64.7k • 82 • 2

wzhouad/gemma-2-ultrafeedback-hybrid

Viewer • Updated Aug 21, 2024 • 61.6k • 90 • 7

Collections 1

wzhouad/Llama3-Instruct-8B-WPO-FP

wzhouad/Llama3-Instruct-8B-WPO-HB

wzhouad/zephyr-7B-WPO-FP

wzhouad/zephyr-7B-WPO-HB

models 8

wzhouad/Llama3-Instruct-8B-WPO-HB-v2

wzhouad/Llama3-Instruct-8B-WPO-HB

wzhouad/zephyr-7B-WPO-HB

wzhouad/gemma-2-9b-it-WPO-HB

wzhouad/gemma-2-9b-it-WPO-FP

wzhouad/zephyr-7B-WPO-FP

wzhouad/Llama3-Instruct-8B-WPO-FP

wzhouad/prix-lm
