Is this the standard GPTQ quantization?

#5
by molereddy - opened

Wondering because, unlike other quantized models such as those from HuggingQuants, this model ID only mentions the precision and not the quant method.

Neural Magic org

This model uses the compressed-tensors checkpoint format. GPTQ was the algorithm used to produce the weights, but we use compressed-tensors so that a single library can save and load models quantized in various schemes like FP8, INT8, W8A8, and W4A16.
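
To make the distinction concrete, here is a minimal sketch of how a compressed-tensors checkpoint is typically inspected and loaded through transformers. The model ID below is a placeholder, not necessarily this repo's actual ID; the point is that the quant method is recorded in `config.json` rather than in the model name.

```python
# Minimal sketch, assuming transformers and compressed-tensors are installed
# and the model ID is a placeholder for a compressed-tensors checkpoint.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "neuralmagic/SOME-MODEL-quantized.w4a16"  # hypothetical model ID

# The quantization details live in config.json under quantization_config;
# for this format, quant_method is "compressed-tensors", while the scheme
# (e.g. W4A16 produced via GPTQ) is described in the nested config groups.
config = AutoConfig.from_pretrained(model_id)
print(config.quantization_config)

# Loading goes through the usual API; transformers dispatches to the
# compressed-tensors integration based on the quant_method field.
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```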

mgoin changed discussion status to closed