---
tags:
- text-generation-inference
- transformers
- unsloth
- gguf
- reasoning
- Qwen2
- Qwen
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---
![BY_PINKSTACK.png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/2xMulpuSlZ3C1vpGgsAYi.png)

# 🧀 Which quant is right for you?

- ***Q4:*** Best for edge devices such as phones or older laptops thanks to its very compact size; quality is acceptable and fully usable.
- ***Q6:*** Suited to most mid-range devices (e.g. a GTX 1650); good quality and fast responses.
- ***Q8:*** For most modern devices; responses are very high quality, but it is slower than Q6.

## Things you should be aware of when using PARM models (Pinkstack Accuracy Reasoning Models) 🧀

This PARM is based on Qwen 2.5 3B, fine-tuned with extra reasoning training so that its outputs resemble Qwen QwQ / o1-mini (only much smaller). We trained it on [this](https://huggingface.co/datasets/gghfez/QwQ-LongCoT-130K-cleaned) dataset. It is designed to run on any device, from your phone to a high-end PC.

To use this model, you need a service that supports the GGUF file format. Additionally, this is the prompt template; it uses the Qwen2 template:

```
{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>
```

Or, if you are using an anti-prompt: `<|end|><|assistant|>`

Using a system prompt is highly recommended.

# Extra information
- **Developed by:** Pinkstack
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2.5-3b-instruct-bnb-4bit

This model was trained using [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

Used this model? Don't forget to leave a like :)

[](https://github.com/unslothai/unsloth)
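If your runtime does not apply the template for you, the prompt layout above can be reproduced by hand. Below is a minimal Python sketch of what the rendered string looks like; the `render_prompt` helper is hypothetical (not part of any library), and it simply mirrors the Ollama-style template from this card, leaving the assistant turn open for the model to complete:

```python
def render_prompt(prompt: str, system: str = "") -> str:
    """Render the card's chat template by hand (hypothetical helper).

    Mirrors: {{ if .System }}<|system|>\n{{ .System }}<|end|>\n{{ end }}
             {{ if .Prompt }}<|user|>\n{{ .Prompt }}<|end|>\n{{ end }}
             <|assistant|>\n  (response is generated by the model)
    """
    parts = []
    if system:  # system turn is optional, as in the template's {{ if .System }}
        parts.append(f"<|system|>\n{system}<|end|>\n")
    if prompt:
        parts.append(f"<|user|>\n{prompt}<|end|>\n")
    parts.append("<|assistant|>\n")  # generation continues from here
    return "".join(parts)

print(render_prompt("What is 2+2?", system="You are a helpful assistant."))
```

Generation should be stopped when the model emits `<|end|>`, which is why `<|end|><|assistant|>` works as an anti-prompt.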