SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training • Paper • 2412.15649 • Published 14 days ago
Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers • Paper • 2412.16102 • Published 14 days ago