MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data Paper • 2406.18790 • Published Jun 26, 2024 • 33
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition Paper • 2210.13352 • Published Oct 24, 2022 • 3
Multi-Span Acoustic Modelling using Raw Waveform Signals Paper • 1906.11047 • Published Jun 21, 2019 • 1
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module Paper • 2311.05556 • Published Nov 9, 2023 • 82
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 57
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale Paper • 2111.09296 • Published Nov 17, 2021 • 2