---
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - gguf
  - reasoning
  - Qwen2
  - Qwen
license: apache-2.0
language:
  - en
pipeline_tag: text-generation
new_version: Pinkstack/PARM-V2-QwQ-Qwen-2.5-o1-3B-GGUF
---

🧀 A newer, higher-quality PARM version of this model is available: https://huggingface.co/Pinkstack/PARM-V2-QwQ-Qwen-2.5-o1-3B-GGUF

![BY_PINKSTACK.png](BY_PINKSTACK.png)

🧀 Which quant is right for you?

- Q4: Best suited to edge devices such as phones or older laptops thanks to its very compact size; quality is acceptable and fully usable.
- Q6: A good fit for most mid-range devices, such as a GTX 1650; good quality with fast responses.
- Q8: Suited to most modern devices; responses are very high quality, but it is slower than Q6.

Things you should be aware of when using PARM models (Pinkstack Accuracy Reasoning Models) 🧀

This PARM is based on Qwen 2.5 3B, which received extra reasoning training so that its outputs resemble Qwen QwQ / o1-mini (only much smaller). We trained using this dataset. It is designed to run on any device, from your phone to a high-end PC.

To use this model, you must use a runtime that supports the GGUF file format. Additionally, this is the prompt template; it uses the qwen2 template.

```
{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>
```
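As an illustration only (this helper is not part of any official library), the template above can be filled in plain Python for a single turn; the `<|system|>`, `<|user|>`, `<|assistant|>`, and `<|end|>` markers come straight from the template:

```python
def build_prompt(prompt, system=None, response=""):
    """Fill the chat template above for a single turn.

    Illustrative sketch: the special tokens are taken from this card's
    template; the function name and signature are our own invention.
    """
    parts = []
    if system:
        parts.append(f"<|system|>\n{system}<|end|>\n")
    if prompt:
        parts.append(f"<|user|>\n{prompt}<|end|>\n")
    # Leave `response` empty when building a prompt for generation;
    # the model continues from the <|assistant|> marker.
    parts.append(f"<|assistant|>\n{response}<|end|>")
    return "".join(parts)

print(build_prompt("What is 2 + 2?", system="You are a careful reasoner."))
```

When `response` is empty, everything after `<|assistant|>` is left for the model to generate.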

Or, if you are using an antiprompt (stop sequence): <|end|><|assistant|>

It is highly recommended to use this model with a system prompt.
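For runtimes built on Ollama, the template above can be wired up in a Modelfile roughly like this (a sketch, not an official file: the GGUF filename and the system prompt text are assumptions):

```
# Hypothetical Ollama Modelfile; replace FROM with your downloaded quant.
FROM ./parm-qwen2.5-3b.Q6_K.gguf
TEMPLATE """{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
"""
PARAMETER stop <|end|>
SYSTEM "You are a helpful reasoning assistant."
```

The `stop` parameter plays the role of the antiprompt mentioned above, and `SYSTEM` supplies the recommended system prompt.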

Extra information

  • Developed by: Pinkstack
  • License: apache-2.0
  • Fine-tuned from model: unsloth/qwen2.5-3b-instruct-bnb-4bit

This model was trained using Unsloth and Hugging Face's TRL library.

Used this model? Don't forget to leave a like :)