How many resources were used for quantizing this model?
1
#4 opened 5 months ago by fengyang1995
Unable to use fp8 kv cache with neuralmagic quants on ampere
#3 opened 6 months ago by ndurkee
Storage format differs from other w4a16 models
2
#2 opened 6 months ago by timdettmers
Weights do not exist when trying to deploy to a SageMaker endpoint
1
#1 opened 6 months ago by LorenzoCevolaniAXA