This is an 8-bit GPTQ-quantized version of Alibaba-NLP/gte-Qwen2-1.5B-instruct, produced by following the quantization example from the AutoGPTQ repository.
Model tree for ktoprakucar/gte-Qwen2-1.5B-instruct-Q8-GPTQ:
- Base model: Alibaba-NLP/gte-Qwen2-1.5B-instruct