Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published Oct 7 • 29
GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding Paper • 2402.02082 • Published Feb 3 • 1
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting Paper • 2309.04269 • Published Sep 8, 2023 • 32
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 65