gemma-2-9B-it-iq1_m

This is a quantized version of the Gemma2 9B instruct model using the IQ1_M quantization method.

Model Details

Original Model: Gemma2-9B-it
Quantization Method: IQ1_M
Precision: 1-bit class (llama.cpp lists IQ1_M at roughly 1.75 bits per weight)
iMatrix: From bartowski; the file is available in the gemma-2-9b-it-gguf repo.

You can use it directly with llama.cpp.
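As a minimal sketch of what running the model with llama.cpp might look like, the snippet below builds a command line for the `llama-cli` binary and runs it only if both the binary and the model file are present. The local filename `gemma-2-9b-it-iq1_m.gguf`, the prompt, and the helper name `build_llama_cli_command` are assumptions for illustration and are not taken from this repo; adjust them to match your download.

```python
import shlex
import shutil
import subprocess
from pathlib import Path

# Hypothetical local filename (assumption); use the name of the GGUF file you downloaded.
MODEL_PATH = Path("gemma-2-9b-it-iq1_m.gguf")


def build_llama_cli_command(model_path, prompt, n_predict=128):
    """Build an argument list for llama.cpp's `llama-cli` binary."""
    return [
        "llama-cli",
        "-m", str(model_path),  # path to the GGUF model file
        "-p", prompt,           # prompt text
        "-n", str(n_predict),   # maximum number of tokens to generate
    ]


if __name__ == "__main__":
    cmd = build_llama_cli_command(MODEL_PATH, "Explain quantization in one sentence.")
    if shutil.which(cmd[0]) and MODEL_PATH.exists():
        # Only invoke llama.cpp when both the binary and the model are available.
        subprocess.run(cmd, check=True)
    else:
        print("llama-cli or the model file was not found; the command would be:")
        print(shlex.join(cmd))
```

Keeping the command construction in its own function makes it easy to check the arguments before launching a long-running generation.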

Format: GGUF
Model size: 9.24B params
Architecture: gemma2
Quantization: 1-bit
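To give a rough sense of what a 1-bit class quantization means for a 9.24B-parameter model, the sketch below estimates the weight footprint from a bits-per-weight figure. The value of 1.75 bits per weight is an assumption based on how llama.cpp describes IQ1_M; real GGUF files are typically somewhat larger, because some tensors are usually stored at higher precision and the file also carries metadata.

```python
def estimated_weight_bytes(n_params, bits_per_weight):
    """Back-of-the-envelope size: parameters times bits per weight, converted to bytes."""
    return n_params * bits_per_weight / 8


N_PARAMS = 9.24e9   # parameter count reported for this model
IQ1_M_BPW = 1.75    # assumption: nominal bits per weight for IQ1_M

size_gb = estimated_weight_bytes(N_PARAMS, IQ1_M_BPW) / 1e9
print(f"Estimated weight size: {size_gb:.2f} GB")  # prints "Estimated weight size: 2.02 GB"
```

For comparison, the same formula with 16 bits per weight gives about 18.5 GB, which is why quantization at this level makes the model fit on much smaller hardware, at a cost in output quality.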
