Sparsing Law: Towards Large Language Models with Greater Activation Sparsity Paper • 2411.02335 • Published Nov 4 • 11
Configurable Foundation Models: Building LLMs from a Modular Perspective Paper • 2409.02877 • Published Sep 4 • 27
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published Jun 10 • 22
ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs Paper • 2402.03804 • Published Feb 6 • 2