OxxoCodes/Meta-Llama-3-8B-Instruct-GPTQ
Built with Meta Llama 3
Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
Model Description
This is a 4-bit GPTQ quantized version of meta-llama/Meta-Llama-3-8B-Instruct.
This model was quantized using the following quantization config:
quantize_config = BaseQuantizeConfig(
bits=4,
group_size=128,
desc_act=False,
damp_percent=0.1,
)
To use this model, you need to install AutoGPTQ. For detailed installation instructions, please refer to the AutoGPTQ GitHub repository.
Example Usage
from auto_gptq import AutoGPTQForCausalLM
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = AutoGPTQForCausalLM.from_quantized("OxxoCodes/Meta-Llama-3-8B-Instruct-GPTQ")
output = model.generate(**tokenizer("The capitol of France is", return_tensors="pt").to(model.device))[0]
print(tokenizer.decode(output))
- Downloads last month
- 10
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.