- ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn` (#14, opened about 13 hours ago by kang1)
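The thread above quotes the ImportError raised when `flash_attn` is missing. A minimal pre-flight check can be sketched as below; the fallback suggestion in the message assumes a transformers-style loader that accepts an eager-attention option, which may not apply to every setup:

```python
import importlib.util

def has_flash_attn() -> bool:
    """Return True if the flash_attn package is importable in this environment."""
    return importlib.util.find_spec("flash_attn") is not None

if not has_flash_attn():
    # The error message's suggested fix; building flash-attn requires a
    # compatible CUDA toolchain, so this may fail on unsupported systems.
    print("flash_attn not found; try `pip install flash_attn`, "
          "or load the model with eager attention if it supports that.")
```

Checking before loading avoids a late ImportError deep inside remote modeling code.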
- How much memory is needed for the 128k context length? (1 reply; #13, opened 25 days ago by ggbondcxk)
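For the 128k-context memory question above, a back-of-envelope KV-cache estimate can be sketched. Every hyperparameter value below is an illustrative placeholder, not this model's actual configuration, and the formula ignores MLA-style cache compression:

```python
def kv_cache_bytes(seq_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Naive KV-cache size: keys and values stored per layer, head, and token."""
    # Factor of 2 accounts for storing both K and V.
    return 2 * seq_len * n_layers * n_kv_heads * head_dim * bytes_per_elem

# Placeholder configuration (NOT the real model's): 60 layers, 8 KV heads,
# head_dim 128, fp16 (2 bytes), 128k tokens.
size = kv_cache_bytes(128_000, 60, 8, 128, 2)
print(f"{size / 2**30:.1f} GiB")  # roughly 29.3 GiB under these assumptions
```

Techniques like grouped-query attention or latent-attention caching shrink the per-token footprint, so the real number can be far smaller.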
- Implement MLA inference optimizations in DeepseekV2Attention (#12, opened about 1 month ago by sy-chen)
- Join LMSYS Chatbot Arena? (1 reply; #11, opened about 1 month ago by Light4Bear)
- Can you provide sample code for training with DeepSpeed ZeRO3? (2 replies; #10, opened about 2 months ago by SupercarryNg)
- Ollama support (1 reply; #9, opened about 2 months ago by Dao3)
- MoE offloading strategy? (2 replies; #8, opened about 2 months ago by Minami-su)
- Update README.md (#7, opened about 2 months ago by VanishingPsychopath)
- KV cache (2 replies; #6, opened about 2 months ago by FrankWu)
- Function/tool calling support (7 replies; #5, opened about 2 months ago by kaijietti)
- Fail to run the example (8 replies; #4, opened about 2 months ago by Leymore)
- GPTQ plz (10 replies; #3, opened about 2 months ago by xuchen123)
- vLLM support (6 replies; #2, opened about 2 months ago by Sihangli)
- llama.cpp support (5 replies; #1, opened about 2 months ago by cpumaxx)