Rakuto/Llama3-ChatQA-1.5-8B-GPTQ-4bit

Model Details

4-bit GPTQ-quantized variant of nvidia/Llama3-ChatQA-1.5-8B.
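A minimal loading sketch, assuming the checkpoint can be loaded through the standard Hugging Face `transformers` API with a GPTQ backend installed (for example `optimum` together with `auto-gptq` or `gptqmodel`) and a CUDA GPU available. It also assumes the repository ships tokenizer files; if it does not, the tokenizer of the base model would be the natural fallback. The helper names `load_model` and `generate` and the default of 64 new tokens are illustrative choices, not part of this model card.

```python
MODEL_ID = "Rakuto/Llama3-ChatQA-1.5-8B-GPTQ-4bit"


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and quantized model (assumes a GPTQ backend is installed)."""
    # Imported lazily so this module can be imported without transformers present.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" lets accelerate place the weights on the available GPU(s).
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Run one generation pass and return the full decoded sequence."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The prompt passed to `generate` should follow whatever prompt format the base model card for nvidia/Llama3-ChatQA-1.5-8B specifies; this sketch does not apply any template itself.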

Use of this model is governed by the META LLAMA 3 COMMUNITY LICENSE AGREEMENT.

Safetensors

Model size: 1.99B params

Tensor types: FP16, I32
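The 1.99B parameter count is lower than the 8B in the model name most likely because of how the weights are stored: GPTQ checkpoints typically pack several low-bit weights into each 32-bit integer, so the I32 tensors hold eight 4-bit values per element, and the count reflects stored tensor elements rather than logical weights. The sketch below illustrates that packing idea in plain Python. The function names and the lowest-nibble-first layout are assumptions made for illustration and are not taken from this checkpoint or from any specific GPTQ library.

```python
def pack_int4(values):
    """Pack unsigned 4-bit values (0..15) into 32-bit words, 8 per word.

    Layout assumption: value j of each group occupies bits 4*j .. 4*j+3.
    Words are returned as non-negative Python ints, not signed int32.
    """
    if len(values) % 8 != 0:
        raise ValueError("number of values must be a multiple of 8")
    words = []
    for start in range(0, len(values), 8):
        word = 0
        for j, v in enumerate(values[start:start + 8]):
            if not 0 <= v <= 15:
                raise ValueError("value does not fit in 4 bits")
            word |= v << (4 * j)
        words.append(word)
    return words


def unpack_int4(words):
    """Inverse of pack_int4: recover eight 4-bit values from each word."""
    return [(w >> (4 * j)) & 0xF for w in words for j in range(8)]
```

Under this scheme, 16 four-bit weights occupy only 2 stored integers, which is why a packed checkpoint reports far fewer elements than the number of weights it represents.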

