lrl-modelcloud committed
Commit 062bc45 • Parent(s): ab84b16
Update README.md
README.md CHANGED
@@ -23,4 +23,27 @@ This model has been quantized using [GPTQModel](https://github.com/ModelCloud/GPTQModel).
 - **quant_method**: "gptq"
 - **checkpoint_format**: "gptq"
 - **meta**:
-  - **quantizer**: "gptqmodel:0.9.9-dev0"
+  - **quantizer**: "gptqmodel:0.9.9-dev0"
+
+**Here is an example:**
+```python
+from transformers import AutoTokenizer
+from gptqmodel import GPTQModel
+
+model_name = "ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit"
+
+prompt = [{"role": "user", "content": "I am in Shanghai, preparing to visit the natural history museum. Can you tell me the best way to"}]
+
+# Load the tokenizer and the quantized model
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = GPTQModel.from_quantized(model_name)
+
+# Render the chat template and tokenize it into input ids on the model's device
+inputs = tokenizer.apply_chat_template(prompt, add_generation_prompt=True, return_tensors="pt").to(model.device)
+
+# Sample a completion; do_sample=True is needed for temperature to take effect
+outputs = model.generate(input_ids=inputs, do_sample=True, temperature=0.95, max_length=128)
+
+# Decode only the newly generated tokens, skipping the echoed prompt
+print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
+```
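
For reference, the fields listed in the diff above mirror the `quantization_config` block that GPTQModel writes into the checkpoint's `config.json`. The sketch below reads them back with `transformers`; it assumes `AutoConfig` exposes that block as a plain dict carrying exactly the `quant_method`, `checkpoint_format`, and `meta` keys shown above.

```python
# A minimal sketch of inspecting the quantization metadata from the README.
# Assumption: AutoConfig surfaces the checkpoint's quantization_config as a
# plain dict with the keys listed in the diff above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit")
qcfg = config.quantization_config

print(qcfg["quant_method"])        # expected: "gptq"
print(qcfg["checkpoint_format"])   # expected: "gptq"
print(qcfg["meta"]["quantizer"])   # expected: "gptqmodel:0.9.9-dev0"
```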