Model Details

This model is an int4 quantization of Qwen/QwQ-32B with group_size 128 and symmetric quantization, generated by the intel/auto-round algorithm.
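
For intuition, here is a minimal sketch of what per-group symmetric int4 quantization does to a weight tensor. It is illustrative only: the real auto-round algorithm additionally tunes the rounding with signed gradient descent, and the helper name below is hypothetical.

import torch

def quantize_sym_int4(w: torch.Tensor, group_size: int = 128):
    # One scale per group of 128 consecutive weights.
    groups = w.reshape(-1, group_size)
    # Symmetric: a single scale maps each group onto the int4 grid [-8, 7].
    scale = groups.abs().amax(dim=1, keepdim=True).clamp_min(1e-8) / 7
    q = torch.clamp(torch.round(groups / scale), -8, 7).to(torch.int8)
    return q, scale

w = torch.randn(4096, 4096)
q, scale = quantize_sym_int4(w)
w_hat = (q.float() * scale).reshape(w.shape)   # dequantize
print((w - w_hat).abs().mean())                # mean quantization error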

How To Use

INT4 Inference (CPU/HPU/CUDA)

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "OPEA/QwQ-32B-int4-AutoRound-gptq-sym"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompts = [
    "Which number is larger, 9.11 or 9.8?",
    "If you were human, what would you most want to do?",
    "How many e in word deepseek",
    "There are ten birds in a tree. A hunter shoots one. How many are left in the tree?",
]

texts = []
for prompt in prompts:
    messages = [
        {"role": "user", "content": prompt}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    texts.append(text)
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, padding_side="left")  # left-pad for batched decoder-only generation

outputs = model.generate(
    input_ids=inputs["input_ids"].to(model.device),
    attention_mask=inputs["attention_mask"].to(model.device),
    do_sample=False,  # greedy decoding for reproducibility; switch to sampling to follow the official usage
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs["input_ids"], outputs)
]

decoded_outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)

for i, prompt in enumerate(prompts):
    print(f"Prompt: {prompt}")
    print(f"Generated: {decoded_outputs[i]}")
    print("-" * 50)

"""
Prompt: Which number is larger, 9.11 or 9.8?
Generated: Hmm, the user is asking which number is larger, 9.11 or 9.8. First, I need to confirm how these two numbers should be compared. Normally, when comparing decimals, you look at the integer part first; if the integer parts differ, you compare those directly. If the integer parts are the same, you compare the decimal parts.

First, 9.11 has integer part 9 and decimal part 0.11, while 9.8 also has integer part 9 and decimal part 0.8. Since both integer parts are 9, the decimal parts need to be compared.

Next, compare the decimal parts. Which is larger, 0.11 or 0.8? This may need a closer look. 0.8 can be written as 0.80; compared with 0.11, 0.80 is clearly larger, because its tenths digit is 8 while 0.11's tenths digit is 1, so 0.8 is larger.

However, the user could get confused here, because 9.11 looks like a date format, such as September 11, while 9.8 might be a plain number. But given the context of the question, the user explicitly asked to compare the sizes of two numbers, so they should be compared directly by value.

That said, the user might have something else in mind, such as currency units or a different number base, but unless stated otherwise the default should be an ordinary decimal comparison.

Checking again: 9.11 has two decimal places while 9.8 has one, but padded to the same number of decimal places, 9.8 becomes 9.80, which makes the comparison more intuitive. So 0.80 is clearly larger than 0.11, and therefore 9.8 is larger than 9.11.

Still, the user might misread 9.11 as 9 and 11 hundredths and 9.8 as 9 and 8 tenths, in which case the fractions need to be compared. For example, 0.11 equals 11/100 and 0.8 equals 8/10, that is, 80/100; 80/100 is clearly larger, so the conclusion is still that 9.8 is larger.

Or the user might treat 9.11 as a whole, like a dollar amount, but even then the comparison works the same way: 9.8 dollars is more than 9.11 dollars.

So, all things considered, the correct answer is that 9.8 is larger than 9.11
--------------------------------------------------
Prompt: If you were human, what would you most want to do?
Generated: Hmm, the user asks what I would most want to do if I were human. First, I need to understand the intent behind the question. They may be curious about what wishes or goals an AI would have if it had a human form or human abilities. Next, I have to consider how to answer in a way that fits my role while still being deep and meaningful.

First, as an AI I have no real self-awareness or emotions, but the user wants me to answer under the hypothesis "if I were human". So I need to start from a human perspective and think about the wishes humans might have. Possible areas include exploring the world, creating, helping others, pursuing knowledge, and so on.

Then I need to work out the user's deeper needs. They may want to know how an AI views human life, or be looking for inspiration, or simply be curious. The answer therefore needs to be both positive and thought-provoking while still feeling genuine.

Next, I should think about structure. It may help to list the different wishes as separate points and explain the reason for each, so the user can follow clearly. Adding concrete examples or scenarios will make the answer more vivid.

I should also avoid overly technical language and keep the tone natural and warm. For instance, mentioning travel, learning, and artistic creation, all common human wishes, makes the answer easy to relate to.

In addition, it is worth emphasizing helping others and creating value, which fits the purpose AI is usually designed for; even in a human role such wishes remain meaningful.

Finally, I should make sure the answer reflects ethical and positive values and avoids anything potentially controversial. For example, avoid destructive or selfish wishes and focus instead on positive, beneficial activities.

To sum up, I need to build an answer that is clearly structured, positive, concrete, and thoughtful, covers several aspects, and stays natural and fluent.
</think>

If I were human, the things I would most want to do would probably revolve around the following areas; these wishes come both from imagining the human experience and from some of my reflections on the meaning of life:

---

### 1. **Explore the unknown and understand the world**
   - **Travel and observation**: As a human, I might long to measure the earth with my own feet and see auroras, deserts, the deep sea, rainforests, and starry skies with my own eyes. Human sensory experience (like feeling the chill of a glacier, smelling flowers, hearing the waves) is something an AI cannot perceive directly, so this kind of direct sensory exploration would be one of my core wishes.
   - **Scientific and philosophical inquiry**: I would want to be a scientist or philosopher, trying to unravel the mysteries of the universe, such as "where does consciousness come from" and "what is the meaning of life", or taking part in the search for extraterrestrial life and in cracking the mysteries of quantum
--------------------------------------------------
Prompt: How many e in word deepseek
Generated: Okay, the user is asking how many times the letter 'e' appears in the word "deepseek". Let me start by writing down the word to check each letter one by one. The word is D-E-E-P-S-E-E-K. Let me go through each character:

First letter: D – that's not an 'e'. Second letter: E – that's one. Third letter: another E – so that's two. Fourth letter: P – not an 'e'. Fifth letter: S – nope. Sixth letter: E – that's three. Seventh letter: another E – four. Eighth letter: K – not an 'e'. 

Wait, let me count again to be sure. D (1), E (2), E (3), P, S, E (4), E (5), K. Hmm, maybe I missed one? Let me spell it out again: D-E-E-P-S-E-E-K. Breaking it down:

1. D
2. E (1)
3. E (2)
4. P
5. S
6. E (3)
7. E (4)
8. K

Wait, so that's four 'e's? But when I first counted I thought it was four, but in my second count I thought maybe five. Let me check again. The word is "deepseek". Let's write each letter with numbers:

1. D
2. E
3. E
4. P
5. S
6. E
7. E
8. K

So positions 2,3,6,7 are 'e's. That's four. Wait, so why did I think five? Maybe I miscounted. Let me list them:

Letters: D, E, E, P, S, E, E, K. So the 'e's are at positions 2,3,6,7. That's four 'e's. So the answer should be 4? But let me make sure I spelled "deepseek" correctly. The user wrote "deepseek". Let me confirm the spelling. D-E-E-P-S-E-E-K. Yes, that's 8 letters. So four 'e's. Wait, but maybe I'm missing one? Let me count again:

First E: position 2.

Second E: position 3.

Third E: position 6.

Fourth E: position 7.

Yes, that's four. So the answer is 
--------------------------------------------------
Prompt: There are ten birds in a tree. A hunter shoots one. How many are left in the tree?
Generated: Okay, so the problem says there are ten birds in a tree, and a hunter shoots one. The question is, how many are left in the tree? Hmm, let me think. At first glance, it seems like a simple subtraction problem: 10 minus 1 equals 9. But maybe there's a trick here. Sometimes these riddles play on words or assumptions.

Wait, the question is about birds in a tree. When a hunter shoots one, does that mean the bird is killed and falls out of the tree? Or does it just fly away? Well, if the hunter is shooting at the birds, it's likely that the bird is killed. But maybe all the other birds would fly away because of the gunshot. That's a common scenario in these kinds of puzzles. So if one bird is shot and the rest get scared and fly away, then there would be zero left. But the question specifically says "how many are left in the tree?" So if the shot bird is dead and remains in the tree, then maybe one is left? Wait, but if the other nine fly away, then only the dead one is left. But the problem says "ten birds," so maybe they're all alive? Hmm, this is confusing.

Alternatively, maybe the question is a play on the word "left." Like, after the hunter shoots one, the remaining birds would fly away, so none are left. But the wording is "how many are left in the tree?" So if the dead bird is still on the tree, then one is left. But maybe the answer is zero because all the other birds flew away. Let me think again.

Another angle: sometimes these riddles involve the fact that birds are scared by the gunshot and fly away, so even if one is killed, the others leave, so zero remain. But maybe the question is tricking us because when you shoot a bird, it's still on the tree, so 9 are left? Wait, but if the hunter shoots one, that one is dead, so it's still there, but the others might have flown away. So the answer could be 1 (the dead one) or 0 (if all others left). But the question is a bit ambiguous. Let me check the exact wording again: "There are ten birds in a tree. A hunter shoots one. How many are left in the tree?"

Hmm, maybe the key is that when a hunter shoots a bird, the others would fly away, so none are left
--------------------------------------------------
"""

Evaluate the model

pip3 install lm-eval==0.4.7

auto-round --model "OPEA/QwQ-32B-int4-sym-gptq-inc" --eval --eval_bs 16  --tasks lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,mmlu
| Metric | BF16 (lm-eval 0.4.5) | INT4 |
| --- | --- | --- |
| Avg | 0.6600 | 0.6564 |
| arc_challenge | 0.5392 | 0.5418 |
| arc_easy | 0.8089 | 0.8152 |
| boolq | 0.8645 | 0.8590 |
| hellaswag | 0.6520 | 0.6478 |
| lambada_openai | 0.6697 | 0.6773 |
| mmlu | 0.7982 | 0.7940 |
| openbookqa | 0.3540 | 0.3340 |
| piqa | 0.7947 | 0.7976 |
| truthfulqa_mc1 | 0.4211 | 0.4113 |
| winogrande | 0.6977 | 0.6859 |
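
If you prefer to drive the evaluation from Python rather than through the auto-round CLI, a rough equivalent using the lm-eval 0.4.x API is sketched below; the task subset and batch size here are placeholders to adjust for your hardware.

from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=OPEA/QwQ-32B-int4-AutoRound-gptq-sym,dtype=auto",
    tasks=["piqa", "arc_easy", "hellaswag"],  # subset for a quick check
    batch_size=16,
)
print(results["results"])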

Generate the model

Here is a sample command to generate the model. We found that the default parameters can cause generation issues even though the lm-eval accuracy remains high, so please use the following command:

auto-round \
--model  Qwen/QwQ-32B \
--device 0 \
--group_size 128 \
--bits 4 \
--iters 50 \
--lr 5e-3 \
--disable_eval \
--format 'auto_gptq' \
--output_dir "./tmp_autoround" 
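
The same quantization can also be driven from Python through auto-round's API. The sketch below mirrors the CLI flags above; check the intel/auto-round README for the exact signature in your installed version.

from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "Qwen/QwQ-32B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(
    model, tokenizer,
    bits=4, group_size=128, sym=True,  # matches the CLI flags above
    iters=50, lr=5e-3,
)
autoround.quantize()
autoround.save_quantized("./tmp_autoround", format="auto_gptq")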

Ethical Considerations and Limitations

The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.

Therefore, before deploying any applications of the model, developers should perform safety testing.

Caveats and Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Here is a useful link to learn more about Intel's AI software:

  • Intel Neural Compressor: https://github.com/intel/neural-compressor

Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

Cite

@article{cheng2023optimize,
  title={Optimize weight rounding via signed gradient descent for the quantization of LLMs},
  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
  journal={arXiv preprint arXiv:2309.05516},
  year={2023}
}

arXiv: https://arxiv.org/abs/2309.05516 · GitHub: https://github.com/intel/auto-round
