OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models Paper • 2307.03084 • Published Jul 5, 2023 • 1
Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models Paper • 2403.08281 • Published Mar 13, 2024
Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process Paper • 2405.11870 • Published May 20, 2024
UltraMedical: Building Specialized Generalists in Biomedicine Paper • 2406.03949 • Published Jun 6, 2024
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding Paper • 2406.12295 • Published Jun 18, 2024
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published Dec 23, 2024 • 41
Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models Paper • 2503.11224 • Published 4 days ago • 26
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published about 1 month ago • 145
Sparse Low-rank Adaptation of Pre-trained Language Models Paper • 2311.11696 • Published Nov 20, 2023 • 2