VidTwin: Video VAE with Decoupled Structure and Dynamics Paper • 2412.17726 • Published 10 days ago • 8
From Elements to Design: A Layered Approach for Automatic Graphic Design Composition Paper • 2412.19712 • Published 6 days ago • 14
Compositional 3D-aware Video Generation with LLM Director Paper • 2409.00558 • Published Aug 31, 2024 • 14
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models Paper • 2403.03100 • Published Mar 5, 2024 • 34
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models Paper • 2310.11954 • Published Oct 18, 2023 • 25
UniAudio: An Audio Foundation Model Toward Universal Audio Generation Paper • 2310.00704 • Published Oct 1, 2023 • 21
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers Paper • 2309.08532 • Published Sep 15, 2023 • 53
PromptTTS 2: Describing and Generating Voices with Text Prompt Paper • 2309.02285 • Published Sep 5, 2023 • 11
Pre-Trained Large Language Models for Industrial Control Paper • 2308.03028 • Published Aug 6, 2023 • 6
GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework Paper • 2305.10841 • Published May 18, 2023 • 2