Is this the standard GPTQ quantization?
#5
by
molereddy
- opened
Wondering because, unlike other quantized models (e.g. those from HuggingQuants), this model ID only mentions the precision and not the quant method.
This model uses compressed-tensors as the checkpoint format. GPTQ was the algorithm used to produce the weights, but we serialize with compressed-tensors so there is a single library for saving and loading models quantized in various schemes such as FP8, INT8, W8A8, and W4A16.
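For anyone checking which format a checkpoint actually uses: a compressed-tensors checkpoint declares itself in the `quantization_config` section of its `config.json`. Below is a minimal sketch of inspecting such a config; the field values shown are illustrative examples of the compressed-tensors layout, not copied from this model's actual config.

```python
import json

# Illustrative stand-in for a model's config.json (assumed example values,
# shaped like a typical compressed-tensors W4A16 checkpoint).
config_json = json.dumps({
    "quantization_config": {
        "quant_method": "compressed-tensors",
        "format": "pack-quantized",
        "config_groups": {
            "group_0": {
                "weights": {"num_bits": 4, "type": "int", "strategy": "group"}
            }
        },
    }
})

cfg = json.loads(config_json)
qc = cfg["quantization_config"]

# The serialization format is named here, independent of the algorithm
# (e.g. GPTQ) that originally produced the quantized weights.
print(qc["quant_method"])  # → compressed-tensors
```

So the model ID advertising only the precision (e.g. W4A16) is consistent with this: the precision describes the scheme, while the checkpoint format is recorded in the config.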
mgoin
changed discussion status to
closed