UCLA Artificial General Intelligence Lab

university

thughost

Activity Feed

UCLA-AGI's activity

angelahzyuan

authored 2 papers about 1 month ago

Accelerated Preference Optimization for Large Language Model Alignment

Paper • 2410.06293 • Published Oct 8 • 5

MARS: Unleashing the Power of Variance Reduction for Training Large Models

Paper • 2411.10438 • Published Nov 15 • 13

thughost

authored a paper about 1 month ago

MARS: Unleashing the Power of Variance Reduction for Training Large Models

Paper • 2411.10438 • Published Nov 15 • 13

thughost

authored a paper 2 months ago

DPLM-2: A Multimodal Diffusion Protein Language Model

Paper • 2410.13782 • Published Oct 17 • 19

thughost

authored 3 papers 3 months ago

General Preference Modeling with Preference Representations for Aligning Language Models

Paper • 2410.02197 • Published Oct 3 • 8

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published Oct 3 • 35

ProteinBench: A Holistic Evaluation of Protein Foundation Models

Paper • 2409.06744 • Published Sep 10 • 7

thughost

posted an update 6 months ago

We've open-sourced the code and models for Self-Play Preference Optimization (SPPO)! 🚀🚀🚀
🤗paper: Self-Play Preference Optimization for Language Model Alignment (2405.00675)
⭐ code: https://github.com/uclaml/SPPO
🤗models: UCLA-AGI/sppo-6635fdd844f2b2e4a94d0b9a

angelahzyuan

authored a paper 8 months ago

Self-Play Preference Optimization for Language Model Alignment

Paper • 2405.00675 • Published May 1 • 25

Jerry46

authored a paper 8 months ago

Self-Play Preference Optimization for Language Model Alignment

Paper • 2405.00675 • Published May 1 • 25

thughost

authored 7 papers 8 months ago

DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization

Paper • 2403.13829 • Published Mar 7

Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs

Paper • 2305.08359 • Published May 15, 2023

Risk Bounds of Accelerated SGD for Overparameterized Linear Regression

Paper • 2311.14222 • Published Nov 23, 2023

How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?

Paper • 2310.08391 • Published Oct 12, 2023

Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits

Paper • 2310.00968 • Published Oct 2, 2023

Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning

Paper • 2310.01380 • Published Oct 2, 2023

Self-Play Preference Optimization for Language Model Alignment

Paper • 2405.00675 • Published May 1 • 25

thughost

posted an update 10 months ago

Check out the demo of SPIN-Diffusion made by @angelahzyuan at: UCLA-AGI/SPIN-Diffusion-demo-v1

45 replies

angelahzyuan

authored 2 papers 10 months ago

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2 • 64

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

Paper • 2402.10210 • Published Feb 15 • 32

Team members 4
