SparseLLMs

community

AI & ML interests

None defined yet.

Recent Activity

ZhengyanZhang authored a paper 20 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Raincleared new activity 2 months ago

SparseLLM/prosparse-llama-2-7b: Model not running on CPU, due to flash_attn package requirement.

Raincleared new activity 3 months ago

SparseLLM/ReluLLaMA-7B: Adding `safetensors` variant of this model

SparseLLM's activity

ZhengyanZhang

authored a paper 20 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 22 days ago • 141

Raincleared

in SparseLLM/prosparse-llama-2-7b 2 months ago

Model not running on CPU, due to flash_attn package requirement.

#8 opened 3 months ago by Akash1003

Raincleared

in SparseLLM/ReluLLaMA-7B 3 months ago

Adding `safetensors` variant of this model

#3 opened 3 months ago by SFconvertbot

Raincleared

in SparseLLM/sparsing-law-0.1b-relu 3 months ago

Adding `safetensors` variant of this model

#1 opened 3 months ago by SFconvertbot

demerzel-iv

authored a paper 4 months ago

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

Paper • 2411.02335 • Published Nov 4, 2024 • 11

demerzel-iv

updated 8 models 4 months ago

Raincleared

authored a paper 4 months ago

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

Paper • 2411.02335 • Published Nov 4, 2024 • 11

demerzel-iv

updated a model 4 months ago

SparseLLM/sparsing-law-0.1b-silu

Text Generation • Updated Nov 5, 2024 • 16

Raincleared

updated a model 4 months ago

SparseLLM/sparsing-law-0.1b-relu

Text Generation • Updated Dec 12, 2024 • 38 • 2

demerzel-iv

updated a model 4 months ago

SparseLLM/sparsing-law-0.1b-relu

Text Generation • Updated Dec 12, 2024 • 38 • 2

Raincleared

authored a paper 6 months ago

Configurable Foundation Models: Building LLMs from a Modular Perspective

Paper • 2409.02877 • Published Sep 4, 2024 • 29

ZhengyanZhang

authored a paper 6 months ago

Configurable Foundation Models: Building LLMs from a Modular Perspective

Paper • 2409.02877 • Published Sep 4, 2024 • 29

ZhengyanZhang

authored a paper 9 months ago

Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters

Paper • 2406.05955 • Published Jun 10, 2024 • 27
