HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving Paper • 2412.20735 • Published 4 days ago • 5
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 7 days ago • 56
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published 4 days ago • 27
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization Paper • 2412.18525 • Published 10 days ago • 57
Bringing Objects to Life: 4D generation from 3D objects Paper • 2412.20422 • Published 5 days ago • 31
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization Paper • 2412.21037 • Published 4 days ago • 20
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging Paper • 2412.19512 • Published 7 days ago • 8
VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models Paper • 2412.19645 • Published 7 days ago • 13
From Elements to Design: A Layered Approach for Automatic Graphic Design Composition Paper • 2412.19712 • Published 7 days ago • 14
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models Paper • 2412.18605 • Published 9 days ago • 17
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment Paper • 2412.19326 • Published 7 days ago • 17
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published 18 days ago • 44
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning Paper • 2412.03248 • Published 30 days ago • 26
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression Paper • 2412.17483 • Published 11 days ago • 29
How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System? Paper • 2412.18495 • Published 10 days ago • 8