SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints • Paper • 2412.07760 • Published 5 days ago • 43
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment • Paper • 2412.04814 • Published 9 days ago • 39
Mind the Time: Temporally-Controlled Multi-Event Video Generation • Paper • 2412.05263 • Published 9 days ago • 9
VisionZip: Longer is Better but Not Necessary in Vision Language Models • Paper • 2412.04467 • Published 10 days ago • 97
VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation • Paper • 2412.02259 • Published 12 days ago • 55
STIV: Scalable Text and Image Conditioned Video Generation • Paper • 2412.07730 • Published 5 days ago • 63
Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language • Paper • 2306.16410 • Published Jun 28, 2023 • 28