Is this the standard GPTQ quantization?
#5
by
molereddy
- opened
Wondering because, unlike other quantized models (e.g. those from HuggingQuants), this model ID only mentions the precision and not the quant method.
This model uses compressed-tensors as the checkpoint format. GPTQ was the algorithm used to produce the weights, but we serialize with compressed-tensors so there is a single library for saving and loading models quantized in various schemes such as FP8, INT8, W8A8, and W4A16.
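For anyone checking which format a checkpoint actually uses: a compressed-tensors checkpoint declares itself in the `quantization_config` section of its `config.json`. Below is a minimal sketch of inspecting such a config; the field values shown are illustrative examples of the compressed-tensors layout, not copied from this model's actual config.

```python
import json

# Illustrative stand-in for a model's config.json (assumed example values,
# shaped like a typical compressed-tensors W4A16 checkpoint).
config_json = json.dumps({
    "quantization_config": {
        "quant_method": "compressed-tensors",
        "format": "pack-quantized",
        "config_groups": {
            "group_0": {
                "weights": {"num_bits": 4, "type": "int", "strategy": "group"}
            }
        },
    }
})

cfg = json.loads(config_json)
qc = cfg["quantization_config"]

# The serialization format is named here, independent of the algorithm
# (e.g. GPTQ) that originally produced the quantized weights.
print(qc["quant_method"])  # → compressed-tensors
```

So the model ID advertising only the precision (e.g. W4A16) is consistent with this: the precision describes the scheme, while the checkpoint format is recorded in the config.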
mgoin
changed discussion status to
closed