SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Paper • 2405.11582 • Published May 19, 2024 • 14
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding Paper • 2405.08344 • Published May 14, 2024 • 12
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting Paper • 2404.18911 • Published Apr 29, 2024 • 29
DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models Paper • 2403.00818 • Published Feb 26, 2024 • 15