
PARM V2

🧀 Which quant is right for you?

  • Q4: Use this on very low-end devices such as older phones or older laptops. It is very compact; quality is lower but fully usable.
  • Q6: Use this on most modern devices. Good quality and very quick responses.
  • Q8: Use this on most modern devices. Responses are very high quality, but it's a little slower than Q6.
  • BF16: This lossless model should only be used if maximum quality is needed. It is slow, but text results are high quality.
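As a rough guide to the size trade-off between quants, here is a minimal sketch that estimates file size from the 494M parameter count listed for this model. The bits-per-weight values are nominal assumptions (4, 6, 8, 16); real GGUF files store extra scales and metadata, so actual files will be somewhat larger. The names `PARAMS`, `NOMINAL_BITS`, and `approx_size_mb` are made up for this example.

```python
# Rough size estimate per quant. Assumption: nominal bits per weight,
# ignoring GGUF metadata and per-block scales, so treat results as lower bounds.
PARAMS = 494_000_000  # parameter count listed for this model

NOMINAL_BITS = {"Q4": 4, "Q6": 6, "Q8": 8, "BF16": 16}


def approx_size_mb(bits_per_weight, params=PARAMS):
    """Return the approximate weight size in megabytes (10^6 bytes)."""
    return params * bits_per_weight / 8 / 1_000_000


if __name__ == "__main__":
    for name, bits in NOMINAL_BITS.items():
        print(f"{name}: roughly {approx_size_mb(bits):.0f} MB or more")
```

By this estimate BF16 is about four times the size of Q4, which is why Q4 suits older phones and laptops.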

Things you should be aware of when using PARM models (Pinkstack Accuracy Reasoning Models) 🧀

This PARM is based on Qwen 2.5 0.5B, which has received extra reasoning training so that its outputs resemble those of Qwen QwQ (only much smaller). We trained with this dataset. It is designed to run on any device, from your phone to a high-end PC, which is why we've included a BF16 quant.

To use this model, you need a service that supports the GGUF file format. Additionally, this is the prompt template; it uses the qwen2 template.

{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>

Or, if you are using an anti-prompt (stop sequence): <|end|><|assistant|>

Using a system prompt is highly recommended.
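If your runtime does not apply the template for you, the prompt can be assembled by hand. The following is a minimal sketch, assuming the template above is rendered literally up to the assistant turn; `build_prompt` and `ANTI_PROMPT` are hypothetical names for this example and not part of any library.

```python
# Minimal sketch: render the prompt template shown above as a plain string.
# Assumption: generation starts right after the "<|assistant|>" line.
ANTI_PROMPT = "<|end|><|assistant|>"  # anti-prompt given in this model card


def build_prompt(prompt, system=""):
    """Build the text sent to the model for a single-turn request."""
    parts = []
    if system:
        # Optional system block, emitted only when a system prompt is given
        parts.append(f"<|system|>\n{system}<|end|>\n")
    if prompt:
        parts.append(f"<|user|>\n{prompt}<|end|>\n")
    # Open the assistant turn; the model writes its response from here
    parts.append("<|assistant|>\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("What is 17 * 3?", system="You are a careful reasoner."))
```

Pass the resulting string to your GGUF runtime as the raw prompt, and supply the anti-prompt wherever the runtime accepts one.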

Extra information

  • Developed by: Pinkstack
  • License: apache-2.0
  • Finetuned from model: unsloth/qwen2.5-0.5b-instruct-bnb-4bit

This model was trained using Unsloth and Hugging Face's TRL library.

Used this model? Don't forget to leave a like :)

  • Format: GGUF
  • Model size: 494M params
  • Architecture: qwen2