Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published about 16 hours ago • 20
MLLM-as-a-Judge for Image Safety without Human Labeling Paper • 2501.00192 • Published 3 days ago • 9
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published about 22 hours ago • 23
ProgCo: Program Helps Self-Correction of Large Language Models Paper • 2501.01264 • Published about 21 hours ago • 16
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published about 16 hours ago • 26
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published 3 days ago • 23
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 1 day ago • 29
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 7 days ago • 60
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published 4 days ago • 23
PERSE: Personalized 3D Generative Avatars from A Single Portrait Paper • 2412.21206 • Published 4 days ago • 14
OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System Paper • 2412.20005 • Published 6 days ago • 13
Training Software Engineering Agents and Verifiers with SWE-Gym Paper • 2412.21139 • Published 4 days ago • 15
Bringing Objects to Life: 4D generation from 3D objects Paper • 2412.20422 • Published 5 days ago • 31
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Paper • 2412.21199 • Published 4 days ago • 9
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization Paper • 2412.21037 • Published 4 days ago • 20
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published 4 days ago • 28
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization Paper • 2412.18525 • Published 10 days ago • 58