How many resources were used to quantize this model?
1
#4 opened 3 months ago by fengyang1995
Unable to use FP8 KV cache with neuralmagic quants on Ampere
#3 opened 3 months ago by ndurkee
Storage format differs from other w4a16 models
2
#2 opened 3 months ago by timdettmers
Weights do not exist when trying to deploy to a SageMaker endpoint
1
#1 opened 3 months ago by LorenzoCevolaniAXA