Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers Paper • 2406.16747 • Published 9 days ago • 16
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack Paper • 2406.10149 • Published 19 days ago • 47
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published 26 days ago • 50
MedFuzz: Exploring the Robustness of Large Language Models in Medical Question Answering Paper • 2406.06573 • Published 30 days ago • 8
UNA - Cybertron 7b [Uniform Neural Alignment] Collection Another rockstar model, born as a leader. Tamed with UNA, DPO, and SFT. • 4 items • Updated Dec 14, 2023 • 4