This is an 8-bit GPTQ-quantized version of Alibaba-NLP/gte-Qwen2-1.5B-instruct, produced by following the quantization example from the AutoGPTQ repository.
Model tree for ktoprakucar/gte-Qwen2-1.5B-instruct-Q8-GPTQ:
- Base model: Alibaba-NLP/gte-Qwen2-1.5B-instruct