This is an 8-bit GPTQ-quantized version of Alibaba-NLP/gte-Qwen2-1.5B-instruct, produced by following the example from the AutoGPTQ repository.
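The quantization step can be sketched roughly as follows. This is a minimal sketch modeled on the basic-usage example in the AutoGPTQ README, not the exact script used for this model: only `bits=8` and the base model name come from this card, while `group_size`, `desc_act`, `OUTPUT_DIR`, the calibration texts, and the use of `trust_remote_code=True` are assumptions that may need adjusting.

```python
from typing import Dict, List

BASE_MODEL = "Alibaba-NLP/gte-Qwen2-1.5B-instruct"  # base model named in this card
OUTPUT_DIR = "gte-Qwen2-1.5B-instruct-Q8-GPTQ"      # hypothetical local output path


def quantize_kwargs() -> Dict[str, object]:
    """Settings for BaseQuantizeConfig.

    Only bits=8 is stated in the card; group_size and desc_act are
    assumed values taken from the AutoGPTQ README example.
    """
    return {"bits": 8, "group_size": 128, "desc_act": False}


def quantize(calibration_texts: List[str]) -> None:
    """Quantize BASE_MODEL with GPTQ and save the result to OUTPUT_DIR."""
    # Imported lazily so this module can be loaded without auto_gptq installed.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)

    # GPTQ calibrates on a small set of tokenized samples
    # (each item carries input_ids and attention_mask).
    examples = [tokenizer(text) for text in calibration_texts]

    config = BaseQuantizeConfig(**quantize_kwargs())
    model = AutoGPTQForCausalLM.from_pretrained(
        BASE_MODEL, config, trust_remote_code=True
    )

    model.quantize(examples)
    model.save_quantized(OUTPUT_DIR, use_safetensors=True)
    tokenizer.save_pretrained(OUTPUT_DIR)
```

Calling `quantize(["some representative sentence", "another sample"])` would run the calibration and write safetensors weights to `OUTPUT_DIR`. In practice a larger calibration set drawn from text similar to the intended inputs is usually preferable to a handful of sentences.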

Format: Safetensors. Model size: 807M params. Tensor types: I32, FP16.
