Identity-Preserving Text-to-Video Generation by Frequency Decomposition Paper • 2411.17440 • Published Nov 26, 2024 • 35
FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity Paper • 2411.15411 • Published Nov 23, 2024 • 7
VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models Paper • 2411.17451 • Published Nov 26, 2024 • 10
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Paper • 2410.10139 • Published Oct 14, 2024 • 51
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions Paper • 2409.18042 • Published Sep 26, 2024 • 37
Video Diffusion Alignment via Reward Gradients Paper • 2407.08737 • Published Jul 11, 2024 • 48
MagicTime Collection MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators • 4 items • Updated Nov 29, 2024 • 13
ChronoMagic-Bench Collection ChronoMagic-Bench : A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation • 6 items • Updated Nov 29, 2024 • 10
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation Paper • 2406.18522 • Published Jun 26, 2024 • 20
ReVideo: Remake a Video with Motion and Content Control Paper • 2405.13865 • Published May 22, 2024 • 23
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators Paper • 2404.05014 • Published Apr 7, 2024 • 33