Scaling Transformers for Low-Bitrate High-Quality Speech Coding • Paper • 2411.19842 • Published 23 days ago • 10
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion • Paper • 2411.18552 • Published 25 days ago • 17
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis • Paper • 2406.08920 • Published Jun 13 • 7
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation • Paper • 2409.03525 • Published Sep 5 • 11
Efficient Audio Captioning with Encoder-Level Knowledge Distillation • Paper • 2407.14329 • Published Jul 19 • 4
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound • Paper • 2405.00233 • Published Apr 30 • 13
OmniCount: Multi-label Object Counting with Semantic-Geometric Priors • Paper • 2403.05435 • Published Mar 8 • 1
Actor-agnostic Multi-label Action Recognition with Multi-modal Query • Paper • 2307.10763 • Published Jul 20, 2023