πŸ§€ New Parm version of this model, that is higher quality: https://huggingface.co/Pinkstack/PARM-V2-QwQ-Qwen-2.5-o1-3B-GGUF

πŸ§€ Which quant is right for you?

  • Q4: Best for edge devices like phones or older laptops thanks to its very compact size; quality is okay but fully usable.
  • Q6: Suitable for most mid-range devices (e.g. a GTX 1650); good quality and fast responses.
  • Q8: Best for most modern devices; responses are very high quality, but it is slower than Q6.
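To fetch a specific quant, one option is the `huggingface-cli` downloader from the `huggingface_hub` package. A sketch (the `.gguf` filename below is an assumption, so check the repository's file list for the real names):

```shell
# Download one quant from the repo into ./models.
# "example-q6_k.gguf" is a hypothetical filename; replace it with an
# actual .gguf file listed on the model page.
huggingface-cli download Pinkstack/PARM-V2-QwQ-Qwen-2.5-o1-3B-GGUF \
  example-q6_k.gguf --local-dir ./models
```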

Things you should be aware of when using PARM models (Pinkstack Accuracy Reasoning Models) πŸ§€

This PARM is based on Qwen 2.5 3B, fine-tuned with extra reasoning training so that its outputs resemble Qwen QwQ / o1-mini (only much smaller). We trained using this dataset. It is designed to run on any device, from your phone to a high-end PC.

To use this model, you must use a service that supports the GGUF file format. Additionally, this is the prompt template; it uses the qwen2 template.

{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>

Or, if you are using an anti-prompt: <|end|><|assistant|>

Highly recommended to use with a system prompt.

Extra information

  • Developed by: Pinkstack
  • License: apache-2.0
  • Finetuned from model: unsloth/qwen2.5-3b-instruct-bnb-4bit

This model was trained using Unsloth and Hugging Face's TRL library.

Used this model? Don't forget to leave a like :)

Downloads last month: 105
Format: GGUF · Model size: 3.09B params · Architecture: qwen2
Available quantizations: 4-bit, 6-bit, 8-bit