Accelerated Preference Optimization for Large Language Model Alignment Paper • 2410.06293 • Published Oct 8 • 5
MARS: Unleashing the Power of Variance Reduction for Training Large Models Paper • 2411.10438 • Published Nov 15 • 13
General Preference Modeling with Preference Representations for Aligning Language Models Paper • 2410.02197 • Published Oct 3 • 8
ProteinBench: A Holistic Evaluation of Protein Foundation Models Paper • 2409.06744 • Published Sep 10 • 7
Post: We've open-sourced the code and models for Self-Play Preference Optimization (SPPO)! 🚀🚀🚀 🤗 paper: Self-Play Preference Optimization for Language Model Alignment (2405.00675) ⭐ code: https://github.com/uclaml/SPPO 🤗 models: UCLA-AGI/sppo-6635fdd844f2b2e4a94d0b9a
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published May 1 • 25
DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization Paper • 2403.13829 • Published Mar 7
Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs Paper • 2305.08359 • Published May 15, 2023
Risk Bounds of Accelerated SGD for Overparameterized Linear Regression Paper • 2311.14222 • Published Nov 23, 2023
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression? Paper • 2310.08391 • Published Oct 12, 2023
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits Paper • 2310.00968 • Published Oct 2, 2023
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning Paper • 2310.01380 • Published Oct 2, 2023
Post: Check out the demo of SPIN-Diffusion made by @angelahzyuan at: UCLA-AGI/SPIN-Diffusion-demo-v1
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 64
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation Paper • 2402.10210 • Published Feb 15 • 32