-
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
Paper • 2406.11839 • Published • 37 -
Pandora: Towards General World Model with Natural Language Actions and Video States
Paper • 2406.09455 • Published • 14 -
WPO: Enhancing RLHF with Weighted Preference Optimization
Paper • 2406.11827 • Published • 14 -
In-Context Editing: Learning Knowledge from Self-Induced Distributions
Paper • 2406.11194 • Published • 15
Collections
Discover the best community collections!
Collections including paper arxiv:2406.11194
-
Iterative Reasoning Preference Optimization
Paper • 2404.19733 • Published • 46 -
Better & Faster Large Language Models via Multi-token Prediction
Paper • 2404.19737 • Published • 73 -
ORPO: Monolithic Preference Optimization without Reference Model
Paper • 2403.07691 • Published • 61 -
KAN: Kolmogorov-Arnold Networks
Paper • 2404.19756 • Published • 108
-
PockEngine: Sparse and Efficient Fine-tuning in a Pocket
Paper • 2310.17752 • Published • 12 -
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Paper • 2311.03285 • Published • 28 -
Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
Paper • 2311.06243 • Published • 17 -
Fine-tuning Language Models for Factuality
Paper • 2311.08401 • Published • 28