Quantization support

#1 opened by sbrzz

Is 4-bit or 8-bit quantization supported for inference (e.g., via a quantization_config)?
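
For reference, this is the kind of loading path the question is about: a minimal sketch using transformers with a bitsandbytes quantization_config. The model id below is a placeholder, and whether this actually works depends on the checkpoint being compatible with bitsandbytes quantization.

```python
# Sketch: loading a checkpoint with a 4-bit (or 8-bit) quantization_config
# via transformers + bitsandbytes. "org/model-name" is a placeholder id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # use load_in_8bit=True instead for 8-bit
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype for 4-bit matmuls
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
)

model = AutoModelForCausalLM.from_pretrained(
    "org/model-name",                       # placeholder model id (assumption)
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("org/model-name")
```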
