lrl-modelcloud committed
Commit 062bc45
1 Parent(s): ab84b16

Update README.md

Files changed (1)
  1. README.md +19 -1
README.md CHANGED
@@ -23,4 +23,22 @@ This model has been quantized using [GPTQModel](https://github.com/ModelCloud/GPTQModel)
  - **quant_method**: "gptq"
  - **checkpoint_format**: "gptq"
  - **meta**:
- - **quantizer**: "gptqmodel:0.9.9-dev0"
+ - **quantizer**: "gptqmodel:0.9.9-dev0"
+
+ **Here is an example:**
+ ```python
+ from transformers import AutoTokenizer
+ from gptqmodel import GPTQModel
+
+ model_name = "ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit"
+
+ prompt = [{"role": "user", "content": "I am in Shanghai, preparing to visit the natural history museum. Can you tell me the best way to"}]
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ model = GPTQModel.from_quantized(model_name)
+
+ inputs = tokenizer.apply_chat_template(prompt, tokenize=False, add_generation_prompt=True)
+ outputs = model.generate(prompts=inputs, temperature=0.95, max_length=128)
+ print(outputs[0].outputs[0].text)
+ ```
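The metadata fields added in this diff (`quant_method`, `checkpoint_format`, and `meta.quantizer`) mirror the quantization config that ships alongside GPTQ checkpoints. As a minimal sketch of inspecting such metadata programmatically, assuming a JSON layout reconstructed from the fields listed above (the exact file structure in the repository is not shown here):

```python
import json

# Hypothetical config snippet mirroring the metadata fields in the diff above;
# the actual quantize config in the checkpoint may contain additional keys.
config_text = """
{
  "quant_method": "gptq",
  "checkpoint_format": "gptq",
  "meta": {"quantizer": "gptqmodel:0.9.9-dev0"}
}
"""

config = json.loads(config_text)

# Read out the quantization method and the tool/version that produced it.
print(config["quant_method"])        # gptq
print(config["checkpoint_format"])   # gptq
print(config["meta"]["quantizer"])   # gptqmodel:0.9.9-dev0
```

Checking `meta.quantizer` this way is a quick sanity test that a checkpoint was produced by the GPTQModel version a loader expects.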