Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models Paper • 2503.11224 • Published 4 days ago • 26
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published about 1 month ago • 145
Sparse Low-rank Adaptation of Pre-trained Language Models Paper • 2311.11696 • Published Nov 20, 2023 • 2
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published Dec 23, 2024 • 41